Display parts of an external website


#1

Is there a way to display only parts of somebody else’s website?

Example: A news website has all the information I want, but I do not want all their links and banners. Can I have something that will search their web page between certain ID tags and display only what is between those ID tags on my website?

Why, you ask? Because they do not offer an RSS feed at this time.


#2

You can try opening the newspage document in PHP and filter out the unwanted HTML code.

I think fsockopen() should get you on your way :)
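
For reference, here is a minimal sketch of what fetching a page with fsockopen() might look like. The function names fetch_with_socket() and split_http_body() are made up for illustration, and the sketch assumes a plain HTTP site on port 80 (no HTTPS, no redirects).

```php
<?php
// Hypothetical helper: returns everything after the first blank line
// of a raw HTTP response, i.e. the body without the headers.
function split_http_body($response)
{
    $parts = explode("\r\n\r\n", $response, 2);
    return isset($parts[1]) ? $parts[1] : '';
}

// Hypothetical helper: fetches http://$host$path over a raw socket and
// returns the response body, or null if the connection fails.
function fetch_with_socket($host, $path = '/', $port = 80, $timeout = 10)
{
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return null;
    }

    // HTTP/1.0 keeps the response simple: no chunked transfer encoding.
    $request  = "GET $path HTTP/1.0\r\n";
    $request .= "Host: $host\r\n";
    $request .= "Connection: close\r\n\r\n";
    fwrite($socket, $request);

    $response = stream_get_contents($socket); // read until the server closes
    fclose($socket);

    return split_http_body($response);
}

// Usage (needs network access):
// $html = fetch_with_socket('www.example.com', '/page.html');
```

What comes back is the complete page, so the filtering still has to happen afterwards.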


#3

Also be careful not to infringe on someone else’s property/copyrights. You should seek permission (or at the very least acknowledge your source).

That being said, there are several ways to obtain the webpages. The cURL extension (curl_init(), curl_exec()) is one, fsockopen() is another. I have also used exec() and shell_exec() in conjunction with a wget command (on a Linux machine). All these methods return the page (generally the entire page), and you will have to parse the information out yourself. If you know the start and end tags, it should be relatively easy, if tedious.
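
As a rough illustration of the parsing step, the sketch below pulls out whatever sits between a start marker and an end marker. The helper name extract_between() and the comment markers in the sample page are invented for the example; the real page will have its own tags.

```php
<?php
// Hypothetical helper: returns the text between the first $start marker
// and the next $end marker (case-insensitive), or null when either
// marker is missing.
function extract_between($html, $start, $end)
{
    $from = stripos($html, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);

    $to = stripos($html, $end, $from);
    if ($to === false) {
        return null;
    }
    return substr($html, $from, $to - $from);
}

// Stand-in for a page that has already been downloaded.
$page = '<div id="nav">links</div><!-- news-start --><p>Headline</p>'
      . '<!-- news-end --><div>banner</div>';

echo extract_between($page, '<!-- news-start -->', '<!-- news-end -->');
// prints: <p>Headline</p>
```

Returning null when a marker is missing makes it easy to notice that the source page has changed.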


#4

Auto-importing someone else’s pages also poses the risk of your pages not showing up properly anymore (or not showing up at all) when the other person (without letting you know of course) changes the overall layout of their website.
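
One way to soften that risk, sketched under the assumption that your parsing step gives you null or an empty string when it cannot find its markers: keep the last good copy on disk and serve that instead. The function name news_or_fallback() and the cache file path are placeholders.

```php
<?php
// Hypothetical helper: $extract is whatever the parsing step returned
// (null or '' when the markers were not found); $cacheFile is a local
// path the web server is allowed to write to.
function news_or_fallback($extract, $cacheFile)
{
    if ($extract !== null && trim($extract) !== '') {
        file_put_contents($cacheFile, $extract); // remember the last good copy
        return $extract;
    }
    if (is_readable($cacheFile)) {
        return file_get_contents($cacheFile);    // serve the stale copy
    }
    return '<p>News is temporarily unavailable.</p>';
}

// Usage: echo news_or_fallback($extract, '/path/to/news-cache.html');
```

A stale copy is usually better than a broken or empty page, and an empty extract is a good signal to go and check the source site.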


#5

Thank you for the heads-up. Here is what I’m trying to do: gather the news from this website and post it on mine.

The URL: http://www.dragracecentral.com/SeriesIndex.asp?Series=NHRA-SUMMIT

The info is between the

Is it possible to display only the info and have the links open in a pop-up window?


#6

Naturally, but that would be HTML, rather than PHP ;)
Anchor Tags
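
If the grabbed HTML is being echoed from PHP anyway, the target attribute can be added on the way through. This is only a sketch: open_links_in_new_window() is a made-up name, a regular expression is a blunt tool for HTML, and target="_blank" opens a new window or tab rather than a sized pop-up (that would take JavaScript).

```php
<?php
// Hypothetical helper: makes every <a> tag in $html open in a new window.
function open_links_in_new_window($html)
{
    // Drop any target the source page already set...
    $html = preg_replace(
        '/(<a\b[^>]*?)\s+target\s*=\s*("[^"]*"|\'[^\']*\'|[^\s>]+)/i',
        '$1',
        $html
    );
    // ...then add our own to every anchor tag.
    return preg_replace('/<a\b/i', '<a target="_blank"', $html);
}

echo open_links_in_new_window('<a href="/story.asp?id=1">Story</a>');
// prints: <a target="_blank" href="/story.asp?id=1">Story</a>
```

Relative links such as the one in the example still point at the original site's paths, so they would also need the source domain prepended before they work from your own pages.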


#7

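You could possibly use something like this. It should draw in the links…
[php]

<?php
$GrabURL   = "http://www.dragracecentral.com/SeriesIndex.asp?Series=NHRA-SUMMIT";
// Adjust these two markers to text that actually appears in the page source
$GrabStart = "Bottom";
$GrabEnd   = "Top";

// Opening a URL with fopen() requires allow_url_fopen to be enabled
$OpenFile = fopen($GrabURL, "r");
// fread() may return only the first network packet, so read the stream to the end
$RetrieveFile = stream_get_contents($OpenFile, 200000); //- Reduce This To Save Memory
fclose($OpenFile);

// eregi() was removed in PHP 7; preg_match() with /i (ignore case) and /s (dot matches newlines) replaces it
$Pattern = '/' . preg_quote($GrabStart, '/') . '(.*)' . preg_quote($GrabEnd, '/') . '/is';
if (preg_match($Pattern, $RetrieveFile, $DataPrint)) {
    $DataPrint[1] = str_replace("last 100 items", "", $DataPrint[1]); //- Use this to replace unwanted text or html
    echo $DataPrint[1];
}
?>

[/php]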
Ripper scripts sometimes don’t work. ASP and PHP pages are generated on the server before the user is able to see the page (hence Active Server Pages and PHP: Hypertext Preprocessor), so the script only ever receives the generated HTML, never the source behind it…

fsockopen might be a better way to go though…

Admin Edit: Added [PHP] code tags for readability. Please refer to http://phphelp.com/guidelines.php for posting guidelines.


#8

If I could get this to work this would be awesome. The above returns a blank page unfortunately. Any other code examples?


#9

kraftworkz, What do you have for code? What have you done?

Please provide us with YOUR code sample and we will try to point you in the right direction, as opposed to just fishing for a piece of code.


#10

Well, the code I posted was just an example. You would probably have to modify the starting and ending markers, and possibly the URL, for it to work.

This also might not work because it is an ASP page. You would be best off retrieving it with cURL…

Here is a way to get a file with cURL. It requires PEAR, but you can find that with a quick search… I think it should actually be able to run without it, but I’m not positive. I pulled this out of one of my heavily modified XML scripts…

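[code]require_once 'PEAR.php'; // nothing below calls PEAR, so this line can probably be dropped
$file = 'http://www.example.com/page.html';
/**
* check, if file is a remote file
*/
// eregi() was removed in PHP 7; preg_match() with the /i flag replaces it
if (preg_match('#^(http|ftp)://#i', $file)) {

    // download the remote file to a local temporary copy
    $tempFile = "Filename.txt";
    $ch = curl_init($file);
    $fp = fopen($tempFile, 'w');
    curl_setopt($ch, CURLOPT_FILE, $fp);   // write the response body to $fp
    curl_setopt($ch, CURLOPT_HEADER, 0);   // leave the HTTP headers out
    curl_exec($ch);
    curl_close($ch);
    fclose($fp);

    // from here on, work with the local copy
    $file = $tempFile;

}
[/code]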
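
The snippet above saves the page to a temporary file. If you would rather have the page in a string so you can parse it straight away, a variation along these lines may be simpler. It is a sketch only: fetch_page() is a made-up name and the timeout value is an arbitrary choice.

```php
<?php
// Hypothetical helper: downloads $url and returns the body as a string,
// or null on failure.
function fetch_page($url, $timeoutSeconds = 10)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
    curl_setopt($ch, CURLOPT_HEADER, false);        // leave the HTTP headers out

    $body = curl_exec($ch); // false on failure
    // The handle is released when $ch goes out of scope.
    return $body === false ? null : $body;
}

// Usage (needs network access):
// $html = fetch_page('http://www.example.com/page.html');
```

With CURLOPT_RETURNTRANSFER set, curl_exec() hands back the page contents rather than echoing them, so the result can go directly into whatever start/end filtering you use.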