Regular expression: Works in regex101.com, but not in actual script

Hello dear forum members! :slight_smile:

So, with explode in PHP I have already managed to split a string into several array elements, which I store in $aufgespalten (the elements I printed are shown below):

[code]$aufgespalten[0]

<?xml version="1.0" encoding="utf-8"?> Kommentar segment 6

$aufgespalten[1]
Kommentar segment 7

$aufgespalten[2]
Kommentar segment 8

$aufgespalten[3]
Kommentar segment 9

$aufgespalten[4]
UEsDBBQAAAAIANGqW0lGCniGFAgAAMswAAAQAAAAaXNoeXVoZXMuYzB6LnRtcO1a3W7bNhs+doHe A+cBw7baVrLut3W8OUmzZkiazHLbw4GWaJsNReojqTjehwG7kO3DTj/sEnaWS9mV7CUlWbIlJ5an LcHWtsjP+/vwIV/yJd3ul1cBQ5dEKir4XnO3s9NEhHvCp3yy14z0uL37afPLXveddvvhg4cPGgci
[…][/code]

From the $key values I know that these are exactly the array indexes shown above.

Anyway, I now want to search in the element $aufgespalten[1]. I want to extract everything between "<Comment " and ">". (Below is the complete string, followed by the part I want to filter out.)

[php] Kommentar segment 7 [/php]

[php] severity="Low" user="PC" date="2016-10-30T20:42:56.0922708+00:00" version="1.0"[/php]

That’s why I would use the following regular expression: [php]@<Comment (.*?)>@[/php]

According to regex101 this should also work - https://regex101.com/r/ZDKVwO/1

But when I use it in my code, nothing is returned; the array is completely empty. Below is the code I use:

[php]
$regex = '@<Comment (.*?)>@';

preg_match($regex, $aufgespalten[1], $treffer);

// Print every entry of the match array (whole match first, then capture groups).
foreach ($treffer as $key => $value) {
    print "\n" . "The string found by the regex is: " . $value . "\n";
}
[/php]
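One thing worth checking here, offered as a sketch rather than a diagnosis: a browser hides anything that looks like a tag, so both the input and the match can be present yet invisible on the page. The example below uses a made-up sample string (an assumption, modelled on the attributes shown above; the real content of $aufgespalten[1] may differ), checks the return value of preg_match, and escapes the output with htmlspecialchars.

```php
<?php
// Hypothetical sample; the real content of $aufgespalten[1] may differ.
$sample = '<Comment severity="Low" user="PC" date="2016-10-30T20:42:56.0922708+00:00" version="1.0">Kommentar segment 7</Comment>';

$regex = '@<Comment (.*?)>@';

// preg_match returns 1 on a match, 0 on no match, and false on error.
$result = preg_match($regex, $sample, $treffer);

if ($result === 1) {
    // $treffer[0] is the whole match, $treffer[1] the first capture group.
    // htmlspecialchars makes "<", ">" and quotes visible in a browser.
    echo htmlspecialchars($treffer[1]), "\n";
} elseif ($result === 0) {
    echo "No match; inspect the raw string with var_dump().\n";
} else {
    echo "Regex error code: ", preg_last_error(), "\n";
}
```

If the pattern matches the sample but not the real element, running var_dump(htmlspecialchars($aufgespalten[1])) would show what the string actually contains.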

Thank you very much in advance, that would help my work project immensely :slight_smile:

I know that it might be easier to handle this kind of file (it is an SDLXLIFF file, which is basically XML) with simplexml_load_file, so that I can easily filter for tags etc., but loading the .xliff file into PHP that way somehow doesn’t work. (Maybe because, although it looks like XML, the file extension is different or something.) Anyway, since I am a complete beginner in PHP, I think it wouldn’t be a bad idea to try out some string functions.

Thank you! :slight_smile:

What does the file look like that you are trying to read? And yes, you should be using an XML reader for this.

Thank you very much for your reply!

The file looks like the attachments. Please note that I created the “indented” version with a plugin in Sublime Text 3 for better readability. Also, I had to rename them to “txt” as these are the only extensions allowed on this forum. The real extension is “sdlxliff”. Please also find attached a Word document to better understand what the source actually means.

Although it looks like XML and even says “XML” right at the top, I somehow cannot process it, neither with simplexml_load_file nor with simplexml_load_string.

Warning: simplexml_load_file(): Test_dings.sdlxliff:1640: parser error : Extra content at the end of the document in C:\xampp\htdocs\XMLTest\XMLTest2.php on line 3

Is there a way to still load this with SimpleXML? Or would there be any other XML Reader which could process that? Thank you very much in advance :slight_smile:
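To see every parser error rather than only the first warning, libxml's internal error handling can be switched on before loading. This is a minimal sketch, assuming a deliberately broken inline string with a second root element, which is one common way to get "Extra content at the end of the document". Whether that is what is wrong with the real file cannot be told from this thread.

```php
<?php
// Hypothetical broken input: a second root element after the first one closes.
$broken = '<?xml version="1.0" encoding="utf-8"?><xliff><file/></xliff><extra/>';

// Collect parser errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$xml = simplexml_load_string($broken);

$messages = [];
if ($xml === false) {
    foreach (libxml_get_errors() as $error) {
        // Each LibXMLError carries the message plus the line and column.
        $messages[] = sprintf(
            'Line %d, column %d: %s',
            $error->line,
            $error->column,
            trim($error->message)
        );
    }
    libxml_clear_errors();
}

echo implode("\n", $messages), "\n";
```

The same pattern works with simplexml_load_file; the reported line number (1640 in the warning above) then points at the place in the file where the parser gave up.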


Hamlet.txt (53.4 KB)

Hamlet_indented.txt (60.6 KB)

Hamlet.doc (8.7 KB)

I have read this thread several times and still am not clear what data you want exactly.

Forget about your attempted approach to get it. What data do you want? The actual comments?

Whatever you’re doing, it is wrong. I suggest you start here for a simple correct way to parse an XML file. http://www.w3schools.com/php/php_xml_simplexml_get.asp

Hi Kevin,

Thank you very much for your reply!

Actually, there is quite a lot I would like to retrieve.

For each segment with a comment:

[ul][li]The segment ID[/li]
[li]The comment[/li]
[li]The source segment[/li]
[li]The translated segment without additions (but also showing the bits of text marked as deleted)[/li]
[li]The translated segment without deletions (but also showing the bits of text marked as added)[/li]
[li]The reviewer/date[/li][/ul]
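As a rough illustration of how the comment text, reviewer and date could be collected once the file parses, here is a sketch on a made-up fragment. The wrapper element name and the nesting are assumptions, not the real SDLXLIFF layout; only the Comment attributes are taken from the string shown earlier in the thread.

```php
<?php
// Hypothetical fragment; the <comments> wrapper and the nesting are assumptions.
$fragment = <<<XML
<comments>
  <Comment severity="Low" user="PC" date="2016-10-30T20:42:56.0922708+00:00" version="1.0">Kommentar segment 7</Comment>
  <Comment severity="Low" user="PC" date="2016-10-30T20:42:56.0922708+00:00" version="1.0">Kommentar segment 8</Comment>
</comments>
XML;

$xml = simplexml_load_string($fragment);

$rows = [];
foreach ($xml->Comment as $comment) {
    // Casting to string gives the element text or the attribute value.
    $rows[] = [
        'text'     => (string) $comment,
        'reviewer' => (string) $comment['user'],
        'date'     => (string) $comment['date'],
        'severity' => (string) $comment['severity'],
    ];
}

print_r($rows);
```

The segment ID and the source and target segments would need the real element paths from the file, which this sketch does not try to guess.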

Start here to see the output array. Look over the link I gave you to get the parts you want. But first, toss that code you already have.

<?php
$xml = simplexml_load_file("YOUR_XML_FILE.xml") or die("Error: Cannot create object");
echo "\n";
print_r($xml);
echo "\n";
?>

Hi Kevin,

Alright, thank you! But as I said, “simplexml_load_file” doesn’t work for me and already gives out an error (see above).

Use the file I attached.

Learn SimpleXML. Here is a basic example with your data.

[php]<?php
$xml = simplexml_load_file("phphelp.xml") or die("Error: Cannot create object");
echo $xml->{'doc-info'}->{'rev-defs'}->{'rev-def'}->attributes()->author;
?>[/php]
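For anyone puzzled by the curly-brace syntax in that example, here is a small sketch on a made-up miniature document. Only the element names and the author attribute come from the code above; the root element and the attribute values are assumptions.

```php
<?php
// Hypothetical miniature document; the root element and values are made up.
$sample = '<root><doc-info><rev-defs><rev-def id="1" author="PC"/></rev-defs></doc-info></root>';
$xml = simplexml_load_string($sample);

// Hyphenated names are not valid bare PHP property names, hence {'...'}.
$revDef = $xml->{'doc-info'}->{'rev-defs'}->{'rev-def'};

// attributes() returns the element's attributes; cast to string for the value.
$author = (string) $revDef->attributes()->author;

// Array-style access is an equivalent shorthand.
$sameAuthor = (string) $revDef['author'];

echo $author, "\n";
```

When an element occurs several times, a foreach over $xml->{'doc-info'}->{'rev-defs'}->{'rev-def'} visits each occurrence in turn.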


phphelp.xml.txt (76.5 KB)

Hi Kevin,

That’s awesome, thank you very much! :slight_smile:

Hi Kevin,

Can you please tell me exactly what you did with the sdlxliff file? When I use yours and your code, it works fine, but when I use another xliff file which I just rename to xml, it does not work.

I just ran your hamlet.txt through an online xml formatter. Somehow you corrupted your file. I didn’t try to debug it.

Do an online Diff with my file and yours to see what is different.
