Need help with a preg_replace pattern

milosh · September 3, 2012, 8:36pm

Hello, everybody, this is my first post to this forum.

I have tried to solve this problem by myself but failed.

A user can enter a string in my HTML form. I want to “clean” it so that it contains only a-z, A-Z, 0-9, spaces, and these Latin letters: ČčĆćĐđŠšŽž.

I thought I knew how to handle the first part of my problem:
[php]$cleanstring = preg_replace('/[^a-zA-Z0-9 ]/', '', $originalstring);[/php]
but unfortunately if, for example, $originalstring = 'Đakče', then $cleanstring is '272ak269e'. I do not understand why preg_replace does not carry the 'Đ' and 'č' characters over into the returned string.
Even if I knew how to solve that problem, I still would not know how to include ČčĆćĐđŠšŽž in my preg_replace search pattern. Is there a way to include them one by one as hex values, or is there some other solution?

Edit: For testing purposes I tried str_replace('Đ', 'Ok', $originalstring), but it does not work: the returned string is the same as the original. Why?

Can anybody help?
Thank you in advance!
milosh

malasho · September 14, 2012, 5:49pm

I see you posted this over a week ago, so you have probably solved your issue already. If not:

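Try the following instead. The u modifier tells PCRE to treat the pattern and the input as UTF-8, so each accented letter is matched as one character rather than as separate bytes:
[php]$cleanstring = preg_replace('/[^a-zA-Z0-9ČčĆćĐđŠšŽž ]/u', '', $originalstring);[/php]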
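
To answer the question about hex values as well, here is a minimal sketch of both ways to write the pattern. It assumes PHP 7 or later, UTF-8 input, and a script file saved as UTF-8; the function names clean_string and clean_string_escaped are made up for illustration.

```php
<?php
// Hypothetical helper (not from the original post): keeps only ASCII letters,
// digits, spaces, and the listed Latin letters. Assumes $input is valid UTF-8.
function clean_string(string $input): string
{
    // The "u" modifier makes PCRE treat pattern and subject as UTF-8,
    // so each accented letter is matched as one character, not as raw bytes.
    $result = preg_replace('/[^a-zA-Z0-9ČčĆćĐđŠšŽž ]/u', '', $input);

    // preg_replace() returns null on error, e.g. when $input is not valid UTF-8.
    return $result ?? '';
}

// The same character class written with code point escapes, for the case
// where the script file itself cannot be saved as UTF-8.
function clean_string_escaped(string $input): string
{
    // Č č Ć ć Đ đ Š š Ž ž, in that order.
    $pattern = '/[^a-zA-Z0-9\x{010C}\x{010D}\x{0106}\x{0107}'
             . '\x{0110}\x{0111}\x{0160}\x{0161}\x{017D}\x{017E} ]/u';

    return preg_replace($pattern, '', $input) ?? '';
}

echo clean_string('Đakče!?'), "\n";          // Đakče
echo clean_string_escaped('Đakče!?'), "\n";  // Đakče
```

Returning an empty string when preg_replace fails is only one possible choice; you may prefer to reject the input instead.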

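Also make sure that you have set an appropriate character set in the document head. If the page is not served as UTF-8, the browser may submit letters it cannot encode as numeric character references such as &#272;, and stripping the &, # and ; from those leaves exactly the '272ak269e' you are seeing.

The correct way to set the document character set depends on which version of HTML you are using. Here is an example for HTML5: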

<head>
<meta charset="UTF-8" />
<title>Title</title>
</head>
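
If some input has already arrived as numeric character references, it can be decoded before filtering. This is a rough sketch that assumes this is what happened to your string and that the script works in UTF-8; decode_submitted is a made-up name.

```php
<?php
// Hypothetical helper: turns references such as &#272; back into characters
// before any filtering is applied. Assumes the script works in UTF-8.
function decode_submitted(string $raw): string
{
    return html_entity_decode($raw, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// What a non-UTF-8 page may submit for "Đakče".
$raw = '&#272;ak&#269;e';

// The original pattern removes "&", "#" and ";" but keeps the digits.
$stripped = preg_replace('/[^a-zA-Z0-9 ]/', '', $raw);
echo $stripped, "\n";                // 272ak269e

// Decoding first recovers the real letters.
echo decode_submitted($raw), "\n";   // Đakče
```

Decode first and filter afterwards, so that anything the decoding step produces still goes through the whitelist.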

Please let me know if this doesn’t work.