get html

Hello, our company is building a web scraper for some of our applications. The tool is already built so that it outputs the HTML of a web page using simple_html_dom:

$html = file_get_html('http://xxx.com/search/records/search.x?record=x&Location=x');

Is there a way to have multiple file_get_html calls on one page?

We tried:

$html = file_get_html('http://xxx.com/search/records/search.x?record=x&Location=x');
$html1 = file_get_html('http://xxx.com/search/records/search.x?record=x&Location=x');

and so on, but it doesn't seem to work. Any code that would help would be greatly appreciated.

thanks

dan

What is your goal in creating 2 identical DOM objects?
If you only need a second variable for the same DOM, you can assign the object to another variable (in PHP 5 and later both variables then point to the same object; it is not copied):
[php]$html = file_get_html('http://xxx.com/search/records/search.x?record=x&Location=x');
$html1 = $html;
[/php]
or create a reference to this object:
[php]$html2 = & $html;[/php]
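
One caveat on the two snippets above, shown as a minimal sketch: in PHP 5 and later a plain assignment copies a handle to the object rather than the object itself, so both names end up pointing at one DOM. The example uses a plain `stdClass` as a stand-in for the DOM object (an assumption, just to keep it runnable without the library), and `clone` for a shallow copy. Whether a cloned simple_html_dom object behaves well is not something this sketch tests.

```php
<?php
// Stand-in for a DOM object; stdClass is used only to keep the sketch self-contained.
$html = new stdClass();
$html->title = 'first';

$alias = $html;        // copies the handle, not the object
$copy  = clone $html;  // shallow copy: a separate object

$alias->title = 'changed';

var_dump($html->title);      // string(7) "changed" (same object as $alias)
var_dump($copy->title);      // string(5) "first" (independent copy)
var_dump($alias === $html);  // bool(true)
var_dump($copy === $html);   // bool(false)
```

So if you want two pages, you need two separate file_get_html calls, not a copy of the first result.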

Basically, we have it displaying the site via file_get_html('http://xxx.com/search/records/search.x?record=x&Location=x'). Say, for argument's sake, the search term was "weather": the script then filters on specific tags within our site, so instead of the entire page we get only the data we need.

The goal with 2 DOMs is that I was hoping to display more than one search term or website, e.g.

file_get_html('http://xxx.com/search/records/search.x?record=weather&Location=x3');
file_get_html('http://xxx.com/search/records/search.x?record=data&Location=x5');

Each of our systems uses the same div tags for listings, so if we could load more than one page with file_get_html, we could pull all the identical divs from each page.

In your first post you said this is not working:
[php]$html = file_get_html('http://xxx.com/search/records/search.x?record=weather&Location=x3');
$html1 = file_get_html('http://xxx.com/search/records/search.x?record=data&Location=x5');[/php]

That should work. You just need to modify the code that displays the results stored in $html, so that the results from $html and $html1 are combined. I do not know what your display code looks like, but you can at least try this to confirm that both pages are retrieved:
[php]var_dump($html);
var_dump($html1);[/php]
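
If you end up with more than two pages, one option is to loop over the URLs and flag any that failed to load. As far as I know, recent versions of the library return `false` from `file_get_html` when no DOM could be built, so a strict comparison catches that. A rough sketch, where `load_pages` and its `$loader` parameter are made-up names for illustration (the loader is passed in so the function can be tried without the network):

```php
<?php
/**
 * Load every URL with the given loader and split the results into
 * successes and failures. Hypothetical helper, not part of simple_html_dom.
 */
function load_pages(array $urls, callable $loader): array
{
    $loaded = [];
    $failed = [];
    foreach ($urls as $url) {
        $dom = $loader($url);
        if ($dom === false || $dom === null) {
            $failed[] = $url;       // remember which page did not load
        } else {
            $loaded[$url] = $dom;   // keep the DOM, keyed by its URL
        }
    }
    return ['loaded' => $loaded, 'failed' => $failed];
}
```

With the library included, the call would look like `load_pages(array($url1, $url2), 'file_get_html')`, after which you can loop over the `loaded` entry and print the `failed` one.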

Hello,

This is the script we are using to search and display

<?php
// Example of how to use a basic selector to retrieve HTML contents
include('../simple_html_dom.php');

// Get DOM from URL or file
$html = file_get_html('http://xxx.com/search/records/search.x?record=weather&Location=x3');

// Find all div tags with id=record
foreach ($html->find('div#record') as $e)
    echo $e->innertext . '<br>';

// Find all div tags with class=record
foreach ($html->find('div.record') as $e)
    echo $e->outertext . '<br>';
?>

What would I need to change in order to display both?

thanks

[php]// Get DOM from URL or file
$html = file_get_html('http://xxx.com/search/records/search.x?record=weather&Location=x3');
$html1 = file_get_html('http://xxx.com/search/records/search.x?record=data&Location=x5');

// Find all div tags with id=record
foreach ($html->find('div#record') as $e)
    echo $e->innertext . '<br>';   // <---- find on first page
foreach ($html1->find('div#record') as $e)
    echo $e->innertext . '<br>';   // <---- find on second page

// Find all div tags with class=record
foreach ($html->find('div.record') as $e)
    echo $e->outertext . '<br>';
foreach ($html1->find('div.record') as $e)
    echo $e->outertext . '<br>';[/php]

If you need to sort the results or remove duplicates after merging the results from the 2 pages, you can do this by collecting the results for each page into an array and then using PHP's array functions.
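
As a minimal sketch of that last step, assuming the text of each listing has already been collected into one array per page: the `$page1` and `$page2` values below are invented sample data, and `merge_results` is a made-up helper name.

```php
<?php
/**
 * Merge per-page result arrays, drop exact duplicates, and sort A-Z.
 * Hypothetical helper built only on PHP's standard array functions.
 */
function merge_results(array ...$pages): array
{
    $all = array_merge([], ...$pages);  // one flat list from all pages
    $all = array_unique($all);          // remove exact duplicates
    sort($all, SORT_STRING);            // sort and reindex from 0
    return $all;
}

// Invented sample data standing in for scraped listings.
$page1 = ['Storm warning', 'Heat advisory'];
$page2 = ['Heat advisory', 'Flood watch'];

print_r(merge_results($page1, $page2));
```

In the real script, you would fill each array inside the foreach loops (for example `$page1[] = $e->innertext;`) instead of echoing right away, then echo the merged list at the end.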
