Compare Two Arrays for Matching Phrases

Hi,

I would like to compare two arrays for matching phrases. I have searched everywhere for a solution. Any suggestions would be greatly appreciated.

The PHP code below produces this array:

Array ( [0] => This [1] => tool [3] => compare [4] => two [5] => articles [7] => show [8] => any [9] => duplicate [10] => content [11] => of [14] => more [15] => word [19] => two [20] => word [21] => phrases [22] => should [23] => be [24] => ignored. )

How do I produce the following array by ignoring one- and two-word phrases and only counting phrases of three or more consecutive matching words?

Array ( [0] => compare two articles [1] => show any duplicate content of [2] => two word phrases should be ignored )

Thanks in advance,

Gary

[php]// get user input
$str1 = 'This tool will compare two articles and show any duplicate content of 3 or more word phrases. One and two word phrases should be ignored.';
$str2 = 'This tool compare two articles show any duplicate content of more word. two word phrases should be ignored.';

// explode strings into separate words
$str1array = explode(" ", $str1);
$str2array = explode(" ", $str2);

// compare duplicate words in the two arrays
$dupwords = array_intersect($str1array, $str2array);
print_r($dupwords);[/php]

Well, I created my own method for you. I got the function adv_count_words($str) from http://www.reconn.us/content/view/33/48/; you can read more about how it works there. The rest I wrote myself.

With this function, $dupwords and $dupwords2 contain the same words. $str1 and $str2 do not have "I will" in common, and in $str3 "I will" is a two-word sentence, so it is ignored. I have not added any comments. If you would like some explanation, I will be happy to provide it as best I can.

[php]<?php

// get user input
$str1 = 'This is a test. It really is. Lets add some words to this.';
$str2 = 'This is a test. It really is. I will add some words to this.';
$str3 = 'This is a test. It really is. I will. add some words to this.';

// compare duplicate words in the two arrays
$dupwords = FindDuplicate($str1, $str2);
print_r($dupwords);
$dupwords2 = FindDuplicate($str2, $str3);
echo "\n";
print_r($dupwords2);

// Return the words shared by both strings, skipping sentences of two words or fewer.
function FindDuplicate($str1, $str2) {
	// split the first string into sentences and keep the words of the longer ones
	$array1 = explode(".", $str1);
	$filterdArray1 = array();

	foreach ($array1 as $val) {
		if (adv_count_words($val) > 2) {
			$temp = explode(" ", $val);
			$filterdArray1 = array_merge($filterdArray1, $temp);
		}
	}

	// same for the second string
	$array2 = explode(".", $str2);
	$filterdArray2 = array();

	foreach ($array2 as $val) {
		if (adv_count_words($val) > 2) {
			$temp = explode(" ", $val);
			$filterdArray2 = array_merge($filterdArray2, $temp);
		}
	}

	return array_intersect($filterdArray1, $filterdArray2);
}

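// Count the words in a string. A token counts as a word if it contains a letter or digit.
// eregi() and eregi_replace() were removed in PHP 7, so preg_match() and preg_replace() are used instead.
function adv_count_words($str) {
	$words = 0;
	// collapse runs of spaces into a single space
	$str = preg_replace('/ +/', ' ', $str);
	$array = explode(' ', $str);
	for ($i = 0; $i < count($array); $i++) {
		if (preg_match('/[0-9A-Za-zÀ-ÖØ-öø-ÿ]/u', $array[$i])) {
			$words++;
		}
	}
	return $words;
}[/php]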
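
The code above drops short sentences and then intersects single words, so it returns matching words rather than the phrases themselves. For the output asked for in the first post (runs of three or more consecutive matching words), here is a minimal sketch of one possible approach. It is not the method from the linked page. The names tokenize() and findCommonPhrases() and the $minWords parameter are made up for this example, matching is assumed to be case-sensitive, and punctuation is stripped before comparing. For each position in the first text it looks for the longest run of words that also appears in the second text, and keeps the run if it is long enough.

```php
<?php

// Hypothetical helper: split text into words, dropping punctuation.
function tokenize(string $text): array
{
    preg_match_all('/[\p{L}\p{N}\']+/u', $text, $matches);
    return $matches[0];
}

// Hypothetical helper: collect runs of at least $minWords consecutive
// words from $a that also appear consecutively in $b.
function findCommonPhrases(string $a, string $b, int $minWords = 3): array
{
    $wordsA = tokenize($a);
    $wordsB = tokenize($b);
    $countA = count($wordsA);
    $countB = count($wordsB);
    $phrases = [];

    $i = 0;
    while ($i < $countA) {
        // Longest run starting at $wordsA[$i] that matches anywhere in $wordsB.
        $best = 0;
        for ($j = 0; $j < $countB; $j++) {
            $k = 0;
            while ($i + $k < $countA && $j + $k < $countB
                && $wordsA[$i + $k] === $wordsB[$j + $k]) {
                $k++;
            }
            if ($k > $best) {
                $best = $k;
            }
        }

        if ($best >= $minWords) {
            // Keep the phrase and continue after it.
            $phrases[] = implode(' ', array_slice($wordsA, $i, $best));
            $i += $best;
        } else {
            $i++;
        }
    }

    return $phrases;
}

$str1 = 'This tool will compare two articles and show any duplicate content of 3 or more word phrases. One and two word phrases should be ignored.';
$str2 = 'This tool compare two articles show any duplicate content of more word. two word phrases should be ignored.';

print_r(findCommonPhrases($str1, $str2));
// Array ( [0] => compare two articles [1] => show any duplicate content of [2] => two word phrases should be ignored )
```

This compares every start position in one text against every start position in the other, so it is fine for articles but would get slow on very large inputs. If case should not matter, lowercasing both strings before calling the function would be one option.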

Thanks Jkwakkel,

Thank you for helping me with my PHP. I am learning on the fly, and your code will go a long way toward solving my problem.
