PHP scraper for all headings

Hello:

I am trying to develop a PHP scraper that scrapes all headings h1 to h6 from a web page. The output should show the h tags and their contents. Each line should be something like this

<h1>Heading One</h1>, <h2>Heading two</h2>

etc. I think I am close, but it does not work. If anyone can help me, I would appreciate it.

Thanks,
Randy

Here is my code:

[php]

Headings h1 to h6

Find all the headings on a Web Page

<?php
if (isset($_POST['chkurl']) && !empty($_POST['chkurl'])) {
    // process page for headings
    $url = sprintf('http://www.%s', $_POST["chkurl"]);
    echo "Webpage being processed - ".$url;

    // open the web page as a file
    $fp = @fopen($url, 'r') or die('Cannot access web page');

    // read the data
    while (!feof($fp)) {
        $line = fgets($fp);
        // check for a heading...with string functions
        $pattern = '/\(.+)<\/h[0-6]>/';
        $matches = preg_match_all($pattern, $line, $found);
        if ($matches >= 1) {
            $startpos = strpos($line, "");
            while ($matches > 0) {
                $heading = substr($line, $startpos, ($endpos - $startpos + 2));
                echo $heading;
                $matches--;
                $startpos = strpos($line, "", $startpos + 1);
            }
        }
    }
    // Close the "file"
    fclose($fp);
} else {
?>
Complete URL below.
www.
<?php } ?>

[/php]

You’ve got this:

[php]
$matches = preg_match_all($pattern, $line, $found);
[/php]

Then this:
[php]
if ($matches >= 1 ) { …
[/php]

You probably want this:

[php]
if(count($found) >= 1){ …
[/php]

stlewis:

I tried the code you supplied but the code still did not work. I think the problem is in the pattern. I am still working on the code. Thank you for all your help.
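
On the pattern itself, here is a minimal sketch of one that might be closer to what is wanted. It is only an illustration, not a tested fix for the script above. The function name find_headings and the sample string are made up for this example. It assumes headings are well-formed and not nested. Three things differ from the posted pattern: the class is [1-6] because there is no h0 element, the body is matched lazily so two headings on one line are not swallowed as a single match, and a back-reference makes the closing tag match the opening level.

[php]
<?php
// Hypothetical helper (not from the original post): returns every h1-h6
// element found in an HTML string, tags included.
function find_headings(string $html): array
{
    // <h1>..<h6> with optional attributes, a lazy body, and a back-reference
    // (\1) so the closing tag has the same level as the opening tag.
    // Flags: i = case-insensitive, s = "." also matches newlines.
    $pattern = '/<h([1-6])\b[^>]*>.*?<\/h\1>/is';
    preg_match_all($pattern, $html, $found);
    return $found[0]; // $found[0] holds the full matches
}

// Made-up sample input; the second heading spans two lines on purpose.
$sample = "<h1>Heading One</h1><p>text</p>\n<h2 class=\"x\">Heading\ntwo</h2>";

foreach (find_headings($sample) as $heading) {
    // Escape the tags so the browser shows them instead of rendering them.
    echo htmlspecialchars($heading), "<br>\n";
}
[/php]

To try it on a real page, one option is to pass the whole document, for example the result of file_get_contents($url), instead of one fgets() line at a time, so headings that wrap across lines are still seen. That relies on the same URL-opening support the fopen() call in the original script already needs.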

Thanks,
rman

[php]if(count($found[0]) >= 1){[/php]
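
For anyone comparing the two checks, a small sketch of what preg_match_all() hands back may help. The sample line and pattern are made up for illustration. With the default ordering, $found holds one sub-array for the full matches plus one per capture group, so count($found) stays the same however many headings are on the line, while count($found[0]) and the return value both give the number of matches.

[php]
<?php
// Made-up sample line with two headings.
$line = '<h1>One</h1><h2>Two</h2>';

// Two capture groups: the heading level and the heading text.
$count = preg_match_all('/<h([1-6])>(.*?)<\/h\1>/', $line, $found);

var_dump($count);            // int(2): number of full matches
var_dump(count($found));     // int(3): full matches + one sub-array per group
var_dump(count($found[0]));  // int(2): number of full matches again
[/php]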
