PHP regexp pattern problem

Hello,

I am trying to write a simple script that summarizes lunch menus from different restaurants.

For one of the websites, the data is not formatted well enough for me to use a DOM parser, so I am trying to use a regexp to read the relevant lines. However, I am having a problem matching multiple lines. For example, this is part of the HTML page I want to parse:

"""""" HTML Website """"""
Måndag


28/7, Kycklingryta “Flygande Jakob” med bacon, banan & jordnötter



Veg:Spaghetti med quornfärssås & riven ost
Wok:Fläskfilé i sötsursås
Grill:Hamburgare med bröd & pommes
Grill: Kebabtallrik med pommes & tillbehör

"""""""""""""""""""""""""""""""""

So basically what I want to output with my script is:

28/7, Kycklingryta “Flygande Jakob” med bacon, banan & jordnötter
Veg:Spaghetti med quornfärssås & riven ost
Wok:Fläskfilé i sötsursås
Grill:Hamburgare med bröd & pommes
Grill: Kebabtallrik med pommes & tillbehör

However, I am not sure how to handle this with a regexp, since there are several line breaks between the entries. I have managed to write code that outputs the first line (see below):

[php]

<?php
$url = "http://www.restauranghusman.se/veckans.html";
$content = file_get_contents($url);
$regexp = "/[0-9]+\/[0-9]+[A-Za-z:,& öäå\"]+/";

if (preg_match($regexp, $content, $matches)) {
    echo $matches[0];
} else {
    echo "Did not find any match";
}
?>

[/php]

So, to summarize the question: how should I write the regexp pattern to output the wanted text, and how should the newlines be handled? Also, perhaps there is a better way to do this than with a regexp. If so, please explain which alternative would be better.

Looking forward to your answers.

I’m not even sure you need to use a regex…

I would just replace all the "<br>" tags with carriage return line feeds:

[php]$content = str_replace('<br>', "\r\n", $content);[/php]

Then strip out all the HTML, which would make the above statement look like this:

[php]$content = strip_tags(str_replace('<br>', "\r\n", $content));[/php]

That should leave you close to the result you’re looking for based on the HTML sample you provided.
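
For what it's worth, here is a rough sketch of how that idea could be taken all the way to the five wanted lines. It is only an illustration: the sample markup in $content is made up (I have not checked the real page), the names $text, $lines and $menu are just for this example, and the final filter assumes every menu line starts either with a date like "28/7," or with a label like "Veg:". It also accepts the different ways a line-break tag can be written, since I don't know which one the site uses.

```php
<?php
// Hypothetical sample standing in for the downloaded page; the real
// markup on the restaurant site is an assumption here.
$content = '<p><b>Måndag</b><br><br>'
    . '28/7, Kycklingryta "Flygande Jakob" med bacon, banan &amp; jordnötter<br><br><br>'
    . 'Veg:Spaghetti med quornfärssås &amp; riven ost<br />'
    . 'Wok:Fläskfilé i sötsursås<br/>'
    . 'Grill:Hamburgare med bröd &amp; pommes<br>'
    . 'Grill: Kebabtallrik med pommes &amp; tillbehör</p>';

// Turn every line-break tag variant (<br>, <br/>, <br />, any case) into a newline.
$text = preg_replace('/<br\s*\/?>/i', "\n", $content);

// Drop the remaining tags, then decode entities such as &amp;.
$text = html_entity_decode(strip_tags($text), ENT_QUOTES, 'UTF-8');

// Split into lines, trim whitespace, and discard the empty ones.
$lines = array_filter(array_map('trim', explode("\n", $text)), 'strlen');

// Keep only lines that look like menu entries: they start with a date
// ("28/7,") or with a label followed by a colon ("Veg:", "Grill:").
$menu = array_values(preg_grep('/^(\d+\/\d+,|[^\s:]+:)/u', $lines));

echo implode("\n", $menu), "\n";
```

Working line by line after the cleanup means the regexp never has to span a newline, so the multi-line matching problem goes away. If the real page uses different labels, only the last pattern should need adjusting.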
