XML / PHP String cut off problem



invoke


Hi all,

I need some help with a problem.

I've developed an MS Project import / export component for our software using SAX XML. We take the project file and convert it to XML (average size: 500k), load it up and suck the info out of it to build a replica in our database (SQL Server, Oracle, MySQL).

The problem that I'm having is that at seemingly random points in the XML, strings are being cut off.

For example:
"27-08-2003 5:00:00 PM" becomes "08-2003 5:00:00 PM"

This happens predominantly with dates, although there are also some issues with incorrect boolean values and even missing tasks. All of these values are correct in the XML file, yet by the time they reach the function that reads the data between tags they have been corrupted in some manner (cut off).

Has anyone else experienced this? Or even better knows a way to solve these problems?

Any help would be greatly appreciated.

If this can't be resolved I'll have to consider dumping the entire XML component and use Access conversions of the project files with ODBC instead.

Code examples and snippets are below.

Thanks,
Cam

XML Snippet:

<Task>
  <TaskUID>19</TaskUID>
  <TaskID>19</TaskID>
  <TaskName>Hire Tradepersons</TaskName>
  <TaskIsSummary>False</TaskIsSummary>
  <TaskIsMilestone>False</TaskIsMilestone>
  <TaskOutlineNumber>1.4.2</TaskOutlineNumber>
  <TaskWBS />
  <TaskPriority>500</TaskPriority>
  <TaskPercentageComplete>10</TaskPercentageComplete>
  <TaskCreationDate>11-08-2003 4:54:00 PM</TaskCreationDate>
  <TaskStart>27-08-2003 8:00:00 AM</TaskStart>
  <TaskFinish>27-08-2003 5:00:00 PM</TaskFinish>
  <TaskBaseStart>27-08-2003 8:00:00 AM</TaskBaseStart>
  <TaskBaseFinish>27-08-2003 5:00:00 PM</TaskBaseFinish>
  <TaskActualStart>27-08-2003 8:00:00 AM</TaskActualStart>
  <TaskActualFinish />
  <TaskDuration>4800</TaskDuration>
  <TaskTotalSlack>68400</TaskTotalSlack>
  <TaskFreeSlack>0</TaskFreeSlack>
  <TaskConstraintType>0</TaskConstraintType>
  <TaskConstraintDate />
</Task>

Parser code:

PHP:

<?php

# File to process
$file = "../files/".$id."/import.xml";

# Initialize parser
$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, FALSE);

if (!($fp = fopen($file, "r"))) {
    die("Cannot locate XML data file: $file");
}

while ($data = fread($fp, 4096)) {
    # Replace & with and
    $data = str_replace("&amp;", "and", "$data");
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d",
                xml_error_string(xml_get_error_code($xml_parser)),
                xml_get_current_line_number($xml_parser))
        );
    }
}

xml_parser_free($xml_parser);

?>

Function to read data between tags (character_data_handler) snippet:

PHP:

<?php

# Process data between tags
function characterData($parser, $data) {
    global $currentTag, $projectArray, $taskArray, $linkArray;

    switch ($currentTag) {

        case "TaskUID":
            unset($taskArray[2]);
            $taskArray[0] = $data;
            break;

    }
}

?>

invoke


Problem solved.

When using SAX, the parser chops the input up into small chunks and goes through them one at a time, which improves performance. However, if your data falls across one of those cut points, you may get some corruption.

So it's simple. Process the entire XML file as one big chunk.

To resolve the missing task problems I had to strip all ' characters from the data and replace any "&" characters with "and".

PHP:

<?php

# Get the file size
$fSize = filesize($file);

# Use the file size in the fread function
while ($data = fread($fp, $fSize)) {
    # Replace &'s with and and remove any '
    $data = str_replace("&amp;", "and", "$data");
    $data = str_replace("'", "", "$data");

    // error handler
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d",
            xml_error_string(xml_get_error_code($xml_parser)),
            xml_get_current_line_number($xml_parser)));
    }
}

?>
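
A possible alternative to reading the whole file at once, offered only as a sketch: a SAX parser is allowed to deliver the text of a single element in several calls to the character data handler, and a handler that assigns with `$taskArray[0] = $data` keeps only the last piece. That would fit "27-08-2003 5:00:00 PM" turning into "08-2003 5:00:00 PM", although the thread does not confirm this as the cause. The example below appends each piece to a buffer and only stores the value in the end-element handler. The function name `parseInChunks`, the variables `$buffer` and `$values`, and the 7-byte chunk size are made up for the illustration and are not part of the original component.

```php
<?php

// Hypothetical helper: parse $xml in chunks of $chunkSize bytes and return
// the text of each leaf element, keyed by element name.
function parseInChunks($xml, $chunkSize) {
    $buffer = '';        // text collected so far for the current element
    $values = array();   // completed element values

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);

    xml_set_element_handler(
        $parser,
        // Start tag: begin with an empty buffer.
        function ($p, $name, $attrs) use (&$buffer) {
            $buffer = '';
        },
        // End tag: only now is the full text of the element known.
        function ($p, $name) use (&$buffer, &$values) {
            if (trim($buffer) !== '') {
                $values[$name] = $buffer;
            }
            $buffer = '';
        }
    );

    // Append instead of assign: one text node may arrive in several pieces.
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$buffer) {
            $buffer .= $data;
        }
    );

    $chunks = str_split($xml, $chunkSize);
    $last = count($chunks) - 1;
    foreach ($chunks as $i => $chunk) {
        if (!xml_parse($parser, $chunk, $i === $last)) {
            throw new RuntimeException(sprintf(
                "XML error: %s at line %d",
                xml_error_string(xml_get_error_code($parser)),
                xml_get_current_line_number($parser)
            ));
        }
    }

    return $values;
}

// Deliberately tiny chunks so that the date is split across several reads.
$xml = "<Task><TaskUID>19</TaskUID>"
     . "<TaskFinish>27-08-2003 5:00:00 PM</TaskFinish></Task>";
$values = parseInChunks($xml, 7);
echo $values['TaskFinish'], "\n";
```

With this pattern the chunk size passed to fread() should no longer matter for the values themselves, so a large file would not have to be held in memory in one piece.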

Jubba


With XML you have to make sure that you escape the illegal characters.

', ", <, >, &

They always need to be escaped or it will corrupt your data when trying to parse your XML.

This page: http://www.fawcette.com/vsm/2002_11/online/aspnet_jgoodyear_11_05_02/

has a bit more information about how much trouble they can cause, and what to escape them to.
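
As a rough illustration of that advice, here is a minimal sketch that escapes the five characters listed above with the predefined XML entities when the XML is written, rather than stripping them or swapping "&" for "and" at read time. The helper name `xmlEscape` and the sample task name are invented for the example, and it assumes you control the code that generates the XML file.

```php
<?php

// Hypothetical helper: replace the five XML special characters with
// their predefined entities.
function xmlEscape($text) {
    // strtr() with an array scans the input once, so the "&" inside an
    // entity it has just produced is not escaped a second time.
    return strtr($text, array(
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&apos;',
    ));
}

// Sample value containing an ampersand and single quotes.
$name = "Plumbing & Electrical 'Phase 1'";
echo "<TaskName>" . xmlEscape($name) . "</TaskName>\n";
```

The parser turns the entities back into the original characters when it reads the file, so the task name reaches the database unchanged.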
