Send Form Data to XML file

Good day folks!

I am just getting started in PHP and really could use some help.
I am trying to export an array into an XML file.

Some Background:
I am currently working with a PHP class (Media Grabber) that grabs images from a URL, sort of like Facebook and Google do. I am having issues posting that data after it has been captured.

Here is the process flow

[ul][li]User enters a URL[/li]
[li]$imageMediaGrabber does the heavy lifting[/li]
[li]Grab the array of images[/li]
[li]Export to images.xml[/li][/ul]

I am stuck after $imageMediaGrabber has the array. Please advise as to how I can export it to XML. Take a look below and let me know what you think; the code in question is 2 or 3 lines from the closing PHP tag. A million thanks!

[php]<?php
include 'tut/connect.php';
require_once dirname(__FILE__)."/bin/wlWmg.php";
require_once dirname(__FILE__)."/demo-utils.php";
$url = wlWmgUtils::getArrayValue($_POST, 'user_url', wlWmgUtils::getArrayValue($_GET, 'user_url', null));
?>

<?php
if (isset($_REQUEST['Submit'])) {
    # THIS CODE TELLS MYSQL TO INSERT THE DATA FROM THE FORM INTO YOUR MYSQL TABLE
    $sql = "INSERT INTO $db_table(user_name,user_url) values ('".mysql_real_escape_string(stripslashes($_REQUEST['user_name']))."','".mysql_real_escape_string(stripslashes($_REQUEST['user_url']))."')";
    if($result = mysql_query($sql, $db)) {
        echo 'Thank you. Your information has been entered into our database.';
    } else {
        echo "ERROR: ".mysql_error();
    }
} else {
?>
URL
<?php } ?>

<?php
if(!$image_extensions_active) {
    $image_extensions = null;
}
//the magic starts here ...
//creating the image grabber object by giving some necessary parameters:
//$url: url,
//$image_extensions: the image extensions to grab (null means all)
$imageMediaGrabber = new wlWmgImageGrabber($url, $image_extensions);
//set download switch: if the images should be downloaded to localhost
$imageMediaGrabber->setDoDownload($download);
//limit filter: setting the media limit count to be grabbed
if($limit_active) {
    $imageMediaGrabber->setLimit($limit);
}
//the processor will grab only the media found inside the contents of the tag specified here
if($tagslice_active) {
    $imageMediaGrabber->setGrabOnlyFromTagSlice($tagslice);
}
//grabbing
if($url) {
    $imageMediaGrabber->grab();
    //HOW DO I GET THIS EXPORTED TO XML
    echo ''.print_r($imageMediaGrabber->getValidMediaNames(true), true).'';
}
?>[/php]

Well, I see where you cross-posted on other sites and nobody has helped you.
This is because you are using a PHP class product, and not just PHP and XML.
It is a “canned” program that is NOT standard PHP. But, on the other hand, it was fairly easy
to locate. Here is a link that explains how to capture and download the images. Since I do
not use this program, I cannot explain it all. But, in reviewing the page, it explains things clearly.
Read it ALL! Hope it does what your project requires…

http://wiseloop.com/wiseloop/php-web-media-grabber/documentation/html/howto-tutorial.html

One note: It is unlawful to download and use or publish copyrighted material. So, capturing
graphics off of other sites is frowned upon. Please be careful how you use this software. If someone
who visits your site "captures" images using the software on your site, YOU are responsible
for THEIR use of your site. Lots of fines and lawsuits in that area… Be careful with this! And, good luck!

Thanks for the words of advice. I am aware of copyright laws. Thanks!
This is for a band of corporate internal sites.

I am not looking for help with the class. [tt]I am looking for any useful information that will guide me in exporting data from PHP to XML.[/tt] With or without this custom class. I’ve actually read the entire tut on Wiseloop and contacted the developer with my questions. No help there.

Maybe people don’t understand what I’m asking so I’ll revise the post. :slight_smile:

Okay, internal site was NOT mentioned…

Now, you are talking about capturing XML versions of images.
So, all the data you need is in the XML, correct? Usually you store images by saving them to a folder and then storing their names and XML data in a database. So, expand on what the data actually is. I really don’t want to download and install that program just for testing. Let’s see some data, or at least the format of it, and we can help you save it…
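In code, that usual pattern is roughly this (a sketch only; the folder, table, and variable names here are hypothetical, not from your project):
[php]
<?php
// Sketch of the usual pattern: save the image bytes to a folder,
// record the file name in the database. All names are hypothetical.
$imageUrl = 'http://www.example.com/pics/someone.jpg';
$fileName = basename($imageUrl);

file_put_contents('images/' . $fileName, file_get_contents($imageUrl));
mysql_query("INSERT INTO images (file_name) VALUES ('" . mysql_real_escape_string($fileName) . "')");
?>
[/php]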

Thanks so much for your speedy reply! It means the world

I didn’t realize I could be super descriptive. So here is what I’m trying to do.
My goal is for corporate users to insert their work profile URL into a form.
Once submitted, I want to grab their profile pic’s absolute path (this is where that custom class comes in).
Once I’ve gotten the pic’s location, I would like to drop that in an XML file and,
using AJAX, dynamically create the image elements.

There are 60,000+ employees and I don’t want to store all those photos.
Trying to be fancy.

Please advise to how you think I can accomplish this. Maybe I’m over thinking. :slight_smile:

First, thanks for the details… Now I understand a little better…

If the corporate users already have a URL pointing to their profile, it is somewhere. Is it "in-house" on your server or database? If so, just link to that. If it is wherever they put it, such as Google/Yahoo/AOL/Hotmail/etc., then it would be against company policies to save that data… Not secure… So, I will assume it is "in-house"!

It is VERY easy to store just the URL in a simple database or XML file. Then, to “view” it, you can simply use an iFrame. Just have a frame with a web browser in it and post the URL to that frame. Then, their profile will show on your page, but, really be their page. Did that make sense?
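For the XML half, here is a minimal sketch using PHP's built-in SimpleXML. The file name profiles.xml, the element names, and the $name/$url values are just placeholders; in practice they would come from your form POST:
[php]
<?php
// Minimal sketch: append one profile URL to an XML file with SimpleXML.
// profiles.xml and the element names are placeholders.
$name = 'jdoe';                              // example values; in practice
$url  = 'http://example.com/profiles/jdoe';  // these come from the form POST

$file = 'profiles.xml';

// Load the existing file, or start a fresh document the first time.
$xml = file_exists($file) ? simplexml_load_file($file) : new SimpleXMLElement('<profiles/>');

// Add one entry.
$profile = $xml->addChild('profile');
$profile->addChild('name', $name);
$profile->addChild('url', $url);

// Write it back to disk.
$xml->asXML($file);
?>
[/php]
Reading it back later is just simplexml_load_file('profiles.xml') and looping over $xml->profile.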

Now on another note, are their profiles actual webpages? You mentioned URLs. So, if it is a webpage, you can view their page inside your page, or "scrape" the data off of the page and capture parts of it as easily as any other string data. There are lots of ways to capture this data and display it. Is it possible to post a link to see a picture of the data you are talking about, or perhaps just post an image of it here? I think we still need a little more info… But, maybe this helps you… Hope so… Good luck!

Thanks…

If the corporate users already have a URL pointing to their profile, it is somewhere.
I wish it were that easy! There are multiple servers and hosts. It would be impossible to grab thousands of links! This is why I thought the image grabber class would come in to save the day.

In regards to the iFrames, I’m looking to build something non-deprecated, but they may be a last resort since we don’t need SEO :slight_smile:

Now on another note, are their profiles actual webpages? Yes, these are actually webpages.

So, if it is a webpage, you can view their page inside your page, or "scrape" the data off of the page and capture parts of it as easily as any other string data. There are lots of ways to capture this data and display it. Now this sounds cool, with less guesswork involved. OMG, this would make page optimization beautiful without all that processing! Would you be able to list a function/variable/class/etc.? Thanks so much for your know-how.

Is it possible to post a link to see a picture of the data you are talking about, or perhaps just post an image of it here? I would love to but it’s quite sensitive :slight_smile:

LOL, sounds like we have a plan… Here is a link that tells you how to “scrape” the data off of a webpage.
It is quite simple and I do not want to repeat it all. This tutorial should get you started.

A couple of notes: First, if these "webpages" that have your "profiles" on them are all the same, this will become a very simple process. You would have to study what is inside of one page and then duplicate it for all of them. If they are different layouts, you would have to create a parser that pulls out the garbage first and then parse the results. As an example, I use a small parsing routine that removes all text between script tags, as it is not content, but programming, and is not needed for capturing content. Did that make sense?
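For example, a quick-and-dirty version of that cleanup (a sketch of the idea, not my exact routine) is a regular expression that drops whole script blocks before you parse:
[php]
<?php
// Sketch: strip <script>...</script> blocks out of a fetched page
// before parsing, since they are programming, not content.
$page  = file_get_contents('http://www.example.com/somepage.html');
$clean = preg_replace('#<script\b[^>]*>.*?</script>#is', '', $page);
// $clean now holds the page minus the scripts, ready for content parsing.
?>
[/php]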

I did this for Dictionary.com so my robot A.I. can understand new words that are not in her database. She scrapes their site, pulls the meanings (up to 20) for any new word, and stores them in her database of knowledge for future use. If you use a definition a lot, she moves it up as a better memory of the word… Very cool stuff!

So, here is the link. Hope that helps you…
http://www.devblog.co/php-web-page-scraping-tutorial/ Ask any further questions about it…

This is SUPER!!! It certainly put me on the right track.
[tt]THANK YOU SO MUCH!!![/tt]
I now get to have my weekend back!!! ;D
Many many many thanks!

Glad I could help. Hope it works out for you… Good luck!

Dang it…
It worked for about 20 mins until I got stuck on the same scenario:
trying to grab my URL from the database, grabbing the image, and placing it into…

I thought I was so close. Whatev’s, I’m just going to use the iFrames. Sighs!
Still, thanks for the insight though. You rock!

So, what holds you up at that point? I mean, you have the URL, what is going wrong with the display?
Give a little more info on what is happening. First, you aren’t saving pictures inside the database are you?
That is not good. Also, if you have permission to grab the page, in what manner are you saving the page?
What data are you displaying… It shouldn’t be that hard. At the very least, you can just use a simple PHP
include after stripping out all of the header info…

All programming is possible, just have to figure out the puzzle…

Morning!

Okay, here is what’s happening. I followed the tutorial last night, but it was based on text files and strings. I have no idea about expressions; I just learned how to fetch data from the database. I literally still have the bottle in my mouth. :smiley:

I did a Google search for a similar web scraping/harvesting tut with images; the only thing that came up was another custom class (Simple HTML DOM). That worked handsomely, and I was excited until I tried to extend the functionality.

The below works like a charm, very fast by the way.
[php]
<?php
include 'connect.php';
include('simple_html_dom.php');
$html = file_get_html('http://www.cnn.com');
foreach($html->find('div#pic') as $e) {
?>
Your Corp. Pic
<?php echo $e; ?>
<?php } ?>
[/php]

I need to be able to make $html = file_get_html('url'); dynamic.
This needs to talk to the data that was submitted by the user in the database…
The user’s information is stored in a table called 'user_info' with 'user_name' & 'user_url'.
My goal is to fetch the URL, scrape the image, and dump it into a preloader so that it can be called in by AJAX when needed. My failed attempt below…

[php]
<?php
include 'connect.php';
$sql = "SELECT * FROM user_info";
$result = mysql_query($sql);
while($row = mysql_fetch_array($result)) {
    echo "" . $row['user_name'] . "";
    // I would like for the below information to dynamically populate file_get_html('url'); and scrape images
    echo "" . $row['user_url'] . "";
}
?>
[/php]

It all looked so beautiful in my head, but I guess in theory not so much :frowning:
I am sure there is a way easier way to accomplish this, I just don’t have the knowledge to do so.
The iFrames are so slow!!! I’m sad.

Yes, any project is very pretty in the owner’s head! LOL…

So, this: $html = file_get_html('http://www.cnn.com');
is simple to change… You have your list of URLs for each profile…
Just parse thru them, assigning $url = "yourdomain.com/someonesprofile.php", or whatever URL their profile has.
Then, SCRAPE the page and find the image tag for their picture. Remember, in that page there will be just a tag with the location of a picture, NOT the actual picture. So, you would be looking for THAT URL, which you would then save in your database (just the SRC part). At this point, you have stolen the link to their picture and have it stored in your database. Then, to use it, you display it on whatever page you are creating by inserting it into an image tag: <img src="http://".$pixurl> or whatever you call it
when retrieved from the database…
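To make that SCRAPE step concrete, here is a hedged sketch using PHP's built-in DOMDocument (no extra class needed). It assumes the profile picture sits inside a div with id "pic", borrowed from the Simple HTML DOM example above; you would adjust that to the real markup:
[php]
<?php
// Sketch: pull the SRC of the profile image out of one profile page.
// The div id "pic" is an assumption borrowed from the earlier example.
$url  = 'http://www.example.com/someonesprofile.php'; // really from your database
$page = file_get_contents($url);

$doc = new DOMDocument();
@$doc->loadHTML($page);              // @ hides warnings from sloppy HTML

$div = $doc->getElementById('pic');
if ($div) {
    foreach ($div->getElementsByTagName('img') as $img) {
        $pixurl = $img->getAttribute('src'); // just the SRC part
        // ...save $pixurl to your database here...
    }
}
?>
[/php]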

Did that make sense? The hardest part is the SCRAPE routine. It would need to know which image tag is the person’s profile picture. Most likely it is clearly marked with some other tag or displayed text. You would have to pull a few pages and look at the actual code of each to see how to tell your SCRAPE routine to get the correct image URL.

Hope that helps…

Just parse thru them, assigning $url = "yourdomain.com/someonesprofile.php", or whatever URL their profile has. Then, SCRAPE the page and find the image tag for their picture.

This is where I’m having trouble. Would that look something like…

[php]
<?php
include 'connect.php';
include('simple_html_dom.php');
// Grab Data
$sql = "SELECT * FROM user_info";
$result = mysql_query($sql);
// Iterate through tables
while($row = mysql_fetch_array($result)) {
    // Please don't judge the syntax too bad, I'm struggling a bit
    $url = $row['user_url'];
    $urlList = array();
    array_push($url);
    scrapeThis();
}
function scrapeThis($urlList) {
    $html = file_get_html($url);
    foreach($html->find('div#pic') as $e);
    echo "";
}
?>
[/php]

Well, almost… But, try it this way… (No need to build an array, as the query already is one!)
[php]
// Please don't judge the syntax too bad, I'm struggling a bit
$url = $row['user_url']; //Already got a user URL, now scrape it…

$html = file_get_html($url);
foreach($html->find('div#pic') as $e)
{
    echo "";
}
}
[/php]
Not sure about the scraping function with the "find('div#pic')" part, but the rest should work.
Basically, you pull all of the users’ info, parse thru each row of users, and scrape each user’s URL page.
Should work, but for testing I would add " LIMIT 10" to the query (space, LIMIT, space, 10) so that you only test
with 10 users. If you have 50,000 users, it would run for hours… LOL. Good luck!
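For instance, the test query could look like this (table and column names taken from your earlier posts):
[php]
// Test with a small batch first; drop the LIMIT once it works.
$result = mysql_query("SELECT user_name, user_url FROM user_info LIMIT 10");
[/php]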

Always almost there!!!

Warning: file_get_html() [function.file-get-contents]: Filename cannot be empty

I keep running into this error on every aspect; even with the web grabber, it was the same scenario.

This is why I was trying to iterate through each row and output $row['user_url'] into a string dynamically.

[php]<?php
include 'connect.php';
include('simple_html_dom.php');
$result = mysql_query("SELECT * FROM user_info");
while($row = mysql_fetch_array($result))
{
    $url = $row['user_url'];
    $html = file_get_html($url);
    foreach($html->find('div#pic') as $e)
    {
        echo "";
    }
}
?>[/php]

That code looks good.

But, are you sure you have a valid URL in the database?
Try echo’ing it so you can see. I bet the first part is missing the http:// or has stray quotes or something…
Like this:
[php]
$result = mysql_query("SELECT * FROM user_info");
while($row = mysql_fetch_array($result))
{
    $url = $row['user_url'];
    echo "URL = ***" . $url . "***<br>"; //debugging…delete this line later…
    $html = file_get_html($url);
    echo "Length of page: " . strlen($html) . "<br>"; //debugging…delete this line later…
    if(strlen($html) > 0) {
        foreach($html->find('div#pic') as $e)
        {
            echo "";
        }
    } else {
        echo "Zero sized URL!!!";
    }
}
[/php]
Look between the stars for odd code. Remember, it should be in the format
"http://www.xyz.com/somefolder/somepage.ext". Any difference could cause no output…
And, I added a check for a zero-size page…
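If you want PHP to do that format check for you, here is a small sketch using filter_var (built into PHP 5.2+) that would slot into the loop above, right before the file_get_html() call:
[php]
// Skip rows whose URL is empty or malformed instead of letting
// file_get_html() choke on them.
if (!filter_var($url, FILTER_VALIDATE_URL)) {
    echo "Bad or empty URL, skipping: ***" . $url . "***<br>";
    continue; // move on to the next row of the while loop
}
$html = file_get_html($url);
[/php]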

Yea, all the links are absolute paths, no crazy-looking ones. The same error comes up. Maybe I should look into a client-side scraper if they have one. This is way over my front-end head :slight_smile:

So, after trying my last code, you got a displayed URL.
Did it have HTTP:// in front of it? Was it formed correctly?
If you copy it into a browser, where does it take you?

???

OH, PS: a CLIENT-SIDE scraper: everyone has that built-in. Just RIGHT-CLICK on any blank area of the page and select VIEW SOURCE. Then, you will see exactly what is being displayed!
LOL… Never heard it called that! I like it!
