PHP UTF-8 Help

Hello,

I’m having trouble with strange characters appearing on some webpages. I’ve tried to fix it, but I really don’t know why it’s not working.

If you have a look at http://www.performingartistes.co.uk/artistes/tim-shipman you’ll see some strange characters in the third paragraph. The content looks exactly the same in the database, and I don’t know why. I’m guessing it’s down to how it was stored initially? It’s an old site which I have recently redesigned, but the admin of the site has been writing descriptions in Microsoft Word, then copying and pasting them into the admin panel and saving them.

Anyway, I’ve tried doing this to sort it out:

In the bootstrap.php file
[php]
header('Content-Type: text/html; charset=utf-8');
[/php]

In the header.php file

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

I’ve also got $db->exec("SET NAMES 'utf8'"); after connecting to the database.

What am I missing? I’m really lost with this.

Thanks,
Adam

but the admin of the site has been using Microsoft Word to write out descriptions and then copy and paste the descriptions over and saved them through the admin panel.

Then you are screwed, honestly. Word adds its own clutter to the character set. Take the quotes: they aren’t the same characters that everything else uses. The only way to change such a character is to actually remove it and replace it in the database.

If you really felt like it, you could build a script that goes through the string one character at a time, prints its ASCII value, and flags anything that isn’t ASCII. For example:

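[php]
// Returns true if $str is valid UTF-8: with the /u modifier,
// preg_match() fails on invalid byte sequences.
function is_utf8($str) {
    return (bool) preg_match('//u', $str);
}

$str = "This “is a quoted” string";
// strlen() counts bytes, so this walks the string byte by byte.
// A lone byte above 127 is not valid UTF-8 on its own and gets flagged.
for ($i = 0; $i < strlen($str); $i++) {
    if (is_utf8($str[$i])) {
        echo $str[$i] . " : " . ord($str[$i]) . "\n";
    } else {
        echo "Character not recognized (byte " . ord($str[$i]) . ")\n";
    }
}
[/php]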

Are you sure there’s not an easier way? There’s not a simple way of converting the characters to what they should be?
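One thing that might be worth trying, as a sketch and not a confirmed fix: if the stored text turns out to be Windows-1252 bytes (what Word on Windows typically produces) and not UTF-8, it could be converted before output. The helper name `to_utf8` is hypothetical, the Windows-1252 source encoding is an assumption that would need checking against the actual rows, and the mbstring extension is assumed to be installed.

```php
<?php
// Hypothetical helper: returns $str as UTF-8.
// Assumption: anything that is not already valid UTF-8 is Windows-1252.
function to_utf8(string $str): string {
    if (mb_check_encoding($str, 'UTF-8')) {
        return $str; // already valid UTF-8, leave it untouched
    }
    return mb_convert_encoding($str, 'UTF-8', 'Windows-1252');
}

// "\x93" and "\x94" are the Windows-1252 bytes for curly double quotes.
$raw = "This \x93is a quoted\x94 string";
echo to_utf8($raw), "\n";
```

If the text has already been converted twice somewhere along the way, this will not undo that, so it is worth testing on a copy of a few rows first.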

If you can figure out a way to determine what the original character was, you could replace it with an equivalent.
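As a rough sketch of that idea, assuming the text is already valid UTF-8 and plain ASCII punctuation is acceptable: a lookup table can swap Word’s typographic characters for their nearest equivalents. The function name `replace_word_punctuation` and the set of characters in the map are illustrative assumptions, not a complete list, and the `\u{...}` escapes need PHP 7 or later.

```php
<?php
// Hypothetical helper: swaps common Word typographic characters for ASCII.
function replace_word_punctuation(string $str): string {
    $map = [
        "\u{2018}" => "'",   // left single quote
        "\u{2019}" => "'",   // right single quote / apostrophe
        "\u{201C}" => '"',   // left double quote
        "\u{201D}" => '"',   // right double quote
        "\u{2013}" => '-',   // en dash
        "\u{2014}" => '-',   // em dash
        "\u{2026}" => '...', // ellipsis
    ];
    // strtr() with an array replaces every key with its value in one pass.
    return strtr($str, $map);
}

echo replace_word_punctuation("This \u{201C}is a quoted\u{201D} string"), "\n";
```

The same function could be run over each description before writing it back to the database, ideally after taking a backup.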
