XML to PHP Tree Hierarchy Issue

Ntanel · October 3, 2013, 1:56pm

Hey all,

I am having an issue with my PHP code that reads an unusual XML file produced by Meetup.com. This is the final hurdle that I have with this code. It ignores the tree's hierarchy: instead of reading the item's event name, it displays the venue name. I assume this is because the venue's name tag appears first. What I need is for it to read both the event name and the venue name. I tried a number of things, including listing the item tags like directory paths ("/name" and "/venue/name"), but the code did not like that, so I have returned it to the last working state. In the end, if I can either get the code to respect the tree's hierarchy or just have it read the third name tag and define it as $item[event_name], a month-long issue would be resolved. Any help is appreciated. Thanks in advance.
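
To illustrate my assumption, here is a minimal sketch of what I believe happens. It assumes the parser applies a first-match pattern of the shape '<name.*?>(.*?)</name>'si to the whole item. The $item string is a trimmed-down stand-in for the real feed, and $firstName is a name made up for this example.

```php
<?php
// Trimmed-down <item> body with the same tag order as the Meetup feed
// (an assumption for illustration; the real item has many more tags).
$item = '<venue><name>Ctrl Alt Elite Gaming</name></venue>'
      . '<group><name>Games!  North Coast Gamers - Greater Cleveland Chapter</name></group>'
      . '<name>Pokemon Pre-Regional TCG Tournament</name>';

// preg_match() returns only the first match, so nesting is never considered.
preg_match("'<name.*?>(.*?)</name>'si", $item, $out);
$firstName = trim($out[1]);

echo $firstName, "\n"; // prints the venue name, not the event name
```

Because the search stops at the first <name> tag it meets, the venue's name wins simply because <venue> comes first in the item.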

XML Output:

<results>
 <items>
  <item>
   <venue>
    <address_1>5204 Detroit Rd</address_1>
    <state>OH</state>
    <zip>44035</zip>
    <lat>41.424769</lat>
    <repinned>False</repinned>
    <phone>440-934-1713</phone>
    <name>Ctrl Alt Elite Gaming</name>
    <city>Sheffield Village</city>
    <id>1540018</id>
    <country>us</country>
    <lon>-82.081545</lon>
   </venue>
   <status>upcoming</status>
   <description>Cost: $5</description>
   <maybe_rsvp_count>0</maybe_rsvp_count>
   <waitlist_count>0</waitlist_count>
   <updated>1380159785000</updated>
   <group>
    <who>North Coast Gamers</who>
    <join_mode>open</join_mode>
    <urlname>North-Coast-Gamers-Cleveland</urlname>
    <name>Games!  North Coast Gamers - Greater Cleveland Chapter</name>
    <id>319720</id>
    <group_lat>41.5</group_lat>
    <group_lon>-81.6699981689</group_lon>
   </group>
   <yes_rsvp_count>1</yes_rsvp_count>
   <created>1379556175000</created>
   <visibility>public</visibility>
   <name>Pokemon Pre-Regional TCG Tournament</name>
   <id>141025002</id>
   <headcount>0</headcount>
   <duration>12600000</duration>
   <utc_offset>-14400000</utc_offset>
   <time>1380990600000</time>
   <event_url>http://www.meetup.com/North-Coast-Gamers-Cleveland/events/141025002/</event_url>
  </item>
 </items>
</results>
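
For comparison, this is a rough sketch of the hierarchy-aware read I am after, assuming PHP's SimpleXML extension is available on the server. The $xml string is a shortened copy of the feed above, and the array keys event_name, venue_name, and group_name are names picked for this example only.

```php
<?php
// Shortened copy of the feed; only the tags needed for the example.
$xml = <<<XML
<results>
 <items>
  <item>
   <venue>
    <name>Ctrl Alt Elite Gaming</name>
    <city>Sheffield Village</city>
   </venue>
   <group>
    <name>Games!  North Coast Gamers - Greater Cleveland Chapter</name>
   </group>
   <name>Pokemon Pre-Regional TCG Tournament</name>
   <time>1380990600000</time>
  </item>
 </items>
</results>
XML;

$results = simplexml_load_string($xml);

$events = array();
foreach ($results->items->item as $item) {
    $events[] = array(
        'event_name' => (string) $item->name,        // <name> directly under <item>
        'venue_name' => (string) $item->venue->name, // <name> under <venue>
        'group_name' => (string) $item->group->name, // <name> under <group>
    );
}

print_r($events);
```

Each property access only looks at the direct children of the element it is called on, so the three name tags no longer collide.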

PHP Output:
[php]

<?php
function ShowOneRSS($url) {
    global $rss;
    if ($rs = $rss->get($url)) {
        // Original code
        echo "$rs[title]\n";
        // New code - For testing purposes.
        echo "Title: $rs[title]\n";
        echo "Description: $rs[description]\n";
        echo "Link: $rs[link]\n";
        echo "Updated: $rs[updated]\n";
        echo "\n";
        foreach ($rs['items'] as $item) {
            echo "$item[title]\n$item[event_name] - ";
            $mil = $item['time'] / 1000;                 // Convert time from milliseconds to seconds
            $tz = $item['utc_offset'] / 1000;            // Convert timezone from milliseconds to seconds
            $duration = $item['duration'] / 1000;        // Convert duration from milliseconds to seconds
            $time = $mil + $tz;                          // Convert to U.S. Eastern Timezone
            $dt = new DateTime("@$time");                // Convert UNIX timestamp to PHP DateTime
            echo $dt->format('M. j, Y @ g:i A');         // Event date and time start
            echo " to ";
            $event_length = $time + $duration;           // Adds time and duration
            $event_end = new DateTime("@$event_length"); // Convert UNIX timestamp to PHP DateTime
            echo $event_end->format('g:i A');            // Event time end
            echo "$item[event_datetime] ($item[name])\n";
        }
        if ($rs['items_count'] <= 0) {
            echo "Sorry, no items found in the RSS file.\n";
        }
        echo "\n";
    } else {
        echo "Sorry, it is not possible to reach RSS file: $url\n";
        // you will probably hide this message in a live version
    }
}

// ===============================================================================

// include lastRSS
include "meetup-src.php";

// List of RSS URLs
$rss_left = array(
    'meetup-rss.xml'
);

// Create lastRSS object
$rss = new lastRSS;

// Set cache dir and cache time limit (10 seconds)
// (don't forget to chmod cache dir to 777 to allow writing)
$rss->cache_dir = '/security/cache';
$rss->cache_time = 10;

// Show all rss files
echo "\n";
foreach ($rss_left as $url) {
    ShowOneRSS($url);
}
echo "\n";
?>

[/php]

PHP Source:
[php]

<?php
class lastRSS {
// -------------------------------------------------------------------
// Public properties
// -------------------------------------------------------------------
var $default_cp = 'UTF-8';
var $CDATA = 'nochange';
var $cp = '';
var $items_limit = 0;
var $stripHTML = False;
var $date_format = '';

// -------------------------------------------------------------------
// Private variables
// -------------------------------------------------------------------
var $resultstags = array ('title', 'description', 'link', 'pubDate', 'updated');
var $itemtags = array('title', 'link', 'description', 'name', 'time', 'event_url', 'venue', 'utc_offset', 'duration', 'updated');
var $imagetags = array('title', 'url', 'link', 'width', 'height');
var $textinputtags = array('title', 'description', 'link', 'name');

// -------------------------------------------------------------------
// Parse RSS file and returns associative array.
// -------------------------------------------------------------------
function Get ($rss_url) {
    // If CACHE ENABLED
    if ($this->cache_dir != '') {
        $cache_file = $this->cache_dir . '/rsscache_' . md5($rss_url);
        $timedif = @(time() - filemtime($cache_file));
        if ($timedif < $this->cache_time) {
            // cached file is fresh enough, return cached array
            $result = unserialize(join('', file($cache_file)));
            // set 'cached' to 1 only if cached file is correct
            if ($result) $result['cached'] = 1;
        } else {
            // cached file is too old, create new
            $result = $this->Parse($rss_url);
            $serialized = serialize($result);
            if ($f = @fopen($cache_file, 'w')) {
                fwrite ($f, $serialized, strlen($serialized));
                fclose($f);
            }
            if ($result) $result['cached'] = 0;
        }
    }
    // If CACHE DISABLED >> load and parse the file directly
    else {
        $result = $this->Parse($rss_url);
        if ($result) $result['cached'] = 0;
    }
    // return result
    return $result;
}

// -------------------------------------------------------------------
// Modification of preg_match(); return trimmed field with index 1
// from 'classic' preg_match() array output
// -------------------------------------------------------------------
function my_preg_match ($pattern, $subject) {
    // start regular expression
    preg_match($pattern, $subject, $out);

    // if there is some result... process it and return it
    if(isset($out[1])) {
        // Process CDATA (if present)
        if ($this->CDATA == 'content') { // Get CDATA content (without CDATA tag)
            $out[1] = strtr($out[1], array('<![CDATA['=>'', ']]>'=>''));
        } elseif ($this->CDATA == 'strip') { // Strip CDATA
            $out[1] = strtr($out[1], array('<![CDATA['=>'', ']]>'=>''));
        }

        // If code page is set convert character encoding to required
        if ($this->cp != '')
            //$out[1] = $this->MyConvertEncoding($this->rsscp, $this->cp, $out[1]);
            $out[1] = iconv($this->rsscp, $this->cp.'//TRANSLIT', $out[1]);
        // Return result
        return trim($out[1]);
    } else {
        // if there is NO result, return empty string
        return '';
    }
}

// -------------------------------------------------------------------
// Replace HTML entities &something; by real characters
// -------------------------------------------------------------------
function unhtmlentities ($string) {
    // Get HTML entities table
    $trans_tbl = get_html_translation_table (HTML_ENTITIES, ENT_QUOTES);
    // Flip keys<==>values
    $trans_tbl = array_flip ($trans_tbl);
    // Add support for &apos; entity (missing in HTML_ENTITIES)
    $trans_tbl += array('&apos;' => "'");
    // Replace entities by values
    return strtr ($string, $trans_tbl);
}

// -------------------------------------------------------------------
// Parse() is private method used by Get() to load and parse RSS file.
// Don't use Parse() in your scripts - use Get($rss_file) instead.
// -------------------------------------------------------------------
function Parse ($rss_url) {
    // Open and load RSS file
    if ($f = @fopen($rss_url, 'r')) {
        $rss_content = '';
        while (!feof($f)) {
            $rss_content .= fgets($f, 4096);
        }
        fclose($f);

        // Parse document encoding
        $result['encoding'] = $this->my_preg_match("'encoding=[\'\"](.*?)[\'\"]'si", $rss_content);
        // if document codepage is specified, use it
        if ($result['encoding'] != '')
            { $this->rsscp = $result['encoding']; } // This is used in my_preg_match()
        // otherwise use the default codepage
        else
            { $this->rsscp = $this->default_cp; } // This is used in my_preg_match()

        // Parse RESULTS info - Previously CHANNEL
        preg_match("'<results.*?>(.*?)</results>'si", $rss_content, $out_results);

        foreach($this->resultstags as $resultstag) 
        { 
            $temp = $this->my_preg_match("'<$resultstag.*?>(.*?)</$resultstag>'si", $out_results[1]); 
            if ($temp != '') $result[$resultstag] = $temp; // Set only if not empty 
        } 
        // If date_format is specified and lastBuildDate is present and valid
        // (strtotime() returns false, not -1, on failure in current PHP versions)
        if ($this->date_format != '' && isset($result['lastBuildDate']) && ($timestamp = strtotime($result['lastBuildDate'])) !== false) {
                    // convert lastBuildDate to specified date format
                    $result['lastBuildDate'] = date($this->date_format, $timestamp);
        }

        // Parse TEXTINPUT info 
        preg_match("'<textinput(|[^>]*[^/])>(.*?)</textinput>'si", $rss_content, $out_textinfo); 
            // This slightly strange regexp means:
            // Look for tag <textinput> with or without any attributes, but skip the self-closing version <textinput /> (it's not a beginning tag)
        if (isset($out_textinfo[2])) { 
            foreach($this->textinputtags as $textinputtag) { 
                $temp = $this->my_preg_match("'<$textinputtag.*?>(.*?)</$textinputtag>'si", $out_textinfo[2]); 
                if ($temp != '') $result['textinput_'.$textinputtag] = $temp; // Set only if not empty 
            } 
        } 
        // Parse IMAGE info 
        preg_match("'<image.*?>(.*?)</image>'si", $rss_content, $out_imageinfo); 
        if (isset($out_imageinfo[1])) { 
            foreach($this->imagetags as $imagetag) { 
                $temp = $this->my_preg_match("'<$imagetag.*?>(.*?)</$imagetag>'si", $out_imageinfo[1]); 
                if ($temp != '') $result['image_'.$imagetag] = $temp; // Set only if not empty 
            } 
        } 
        // Parse ITEMS 
        preg_match_all("'<item(| .*?)>(.*?)</item>'si", $rss_content, $items); 
        $rss_items = $items[2]; 
        $i = 0; 
        $result['items'] = array(); // create array even if there are no items 
        foreach($rss_items as $rss_item) { 
            // If number of items is lower than limit: Parse one item
            if ($i < $this->items_limit || $this->items_limit == 0) { 
                foreach($this->itemtags as $itemtag) { 
                    $temp = $this->my_preg_match("'<$itemtag.*?>(.*?)</$itemtag>'si", $rss_item); 
                    if ($temp != '') $result['items'][$i][$itemtag] = $temp; // Set only if not empty 
                } 
                // Strip HTML tags and other stuff from DESCRIPTION
                if ($this->stripHTML && !empty($result['items'][$i]['description']))
                    $result['items'][$i]['description'] = strip_tags($this->unhtmlentities(strip_tags($result['items'][$i]['description'])));

                // Strip HTML tags and other stuff from TITLE
                if ($this->stripHTML && !empty($result['items'][$i]['title']))
                    $result['items'][$i]['title'] = strip_tags($this->unhtmlentities(strip_tags($result['items'][$i]['title'])));

                // If date_format is specified and pubDate is present and valid
                // (strtotime() returns false, not -1, on failure in current PHP versions)
                if ($this->date_format != '' && isset($result['items'][$i]['pubDate']) && ($timestamp = strtotime($result['items'][$i]['pubDate'])) !== false) {
                    // convert pubDate to specified date format
                    $result['items'][$i]['pubDate'] = date($this->date_format, $timestamp);
                }
                // Item counter 
                $i++; 
            } 
        } 

        $result['items_count'] = $i; 
        return $result; 
    } 
    else // Error in opening return False 
    { 
        return False; 
    } 
}

}

?>
[/php]

Just a heads up: this is a modified version of LastRSS 0.9.1 by Vojtech Semecky. It is a free download; you just have to search for it.
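
One possible workaround that stays within the regex approach of this class, offered only as a sketch: search for the venue name inside the <venue> block, and strip the nested <venue> and <group> blocks before looking for the event name. The helpers first_tag() and item_names() are hypothetical and not part of LastRSS, and the sketch assumes those two are the only nested blocks that contain a <name> tag.

```php
<?php
// Hypothetical helper (not part of LastRSS): return the trimmed inner text
// of the first <$tag> element found in $xml, or '' when there is none.
function first_tag($tag, $xml) {
    $quoted = preg_quote($tag, '~');
    if (preg_match("~<{$quoted}(?:\s[^>]*)?>(.*?)</{$quoted}>~si", $xml, $out)) {
        return trim($out[1]);
    }
    return '';
}

// Hypothetical helper: pick the event and venue names out of one <item> body.
function item_names($rss_item) {
    // Venue name: search only inside the <venue> block.
    $venue_name = first_tag('name', first_tag('venue', $rss_item));

    // Event name: remove the nested blocks first, so the only <name> left
    // is the one that is a direct child of <item>.
    $top_level = preg_replace('~<(venue|group)(?:\s[^>]*)?>.*?</\1>~si', '', $rss_item);

    return array(
        'event_name' => first_tag('name', $top_level),
        'venue_name' => $venue_name,
    );
}

// Trimmed-down <item> body with the same tag order as the feed.
$rss_item = '<venue><name>Ctrl Alt Elite Gaming</name></venue>'
          . '<group><name>Games!  North Coast Gamers - Greater Cleveland Chapter</name></group>'
          . '<name>Pokemon Pre-Regional TCG Tournament</name>';

$names = item_names($rss_item);
print_r($names);
```

Inside Parse(), the same idea could fill $result['items'][$i]['event_name'] and a venue name key for each item, though a real XML parser would be more robust than stripping blocks with regular expressions.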

Ntanel · October 3, 2013, 11:09pm

I was unaware of this, but a previous topic that I posted a week ago has been getting responses. The difference between the two posts is that this one contains the complete code.

http://www.phphelp.com/forum/general-php-help/parse-xmlphp-21524/new/#new