Handling international characters in PHP
If you need to handle international characters in PHP, “PHP: The Right Way” recommends doing something like this …
The mb_internal_encoding('UTF-8') call tells PHP's multibyte string functions to use UTF-8 as their default encoding. It is typically placed at the top of the PHP script.
The mb_http_output('UTF-8') call sets the HTTP output character encoding to UTF-8.
But it is more important to have <meta charset="utf-8"> in your HTML (as shown in the above HTML and explained in this separate tutorial).
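A minimal sketch of that setup is shown here. It assumes the mbstring extension is enabled and the file is saved as UTF-8. The sample text and the variable names ($greeting, $safe, $page) are made up for illustration and are not taken from “PHP: The Right Way”.

```php
<?php
// Assumes the mbstring extension is enabled and this file is saved as UTF-8.
mb_internal_encoding('UTF-8'); // default encoding for the mb_* string functions
mb_http_output('UTF-8');       // HTTP output character encoding

// Hypothetical sample text, chosen only for illustration.
$greeting = '¡Hola, señor!';
$safe = htmlspecialchars($greeting, ENT_QUOTES, 'UTF-8');

// The meta charset tag tells the browser how to decode the page.
$page = <<<HTML
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>UTF-8 test</title>
</head>
<body>
<p>{$safe}</p>
</body>
</html>
HTML;

echo $page;
```

Passing 'UTF-8' explicitly to htmlspecialchars keeps the escaping step independent of the server's default_charset setting.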
Note that if you are using PHP string functions such as substr, strlen, and strpos with international characters, you need to use the corresponding mb_* versions instead: mb_substr, mb_strlen, and mb_strpos. The plain functions operate on bytes, while the mb_* functions operate on characters.
For example, suppose we have a Spanish phrase from which we want to extract the first 14 characters. When we use mb_substr, we get the expected result…
But if we run it with substr, we get an incorrect result, because substr counts bytes rather than characters …
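The difference can be sketched as follows. The phrase is a hypothetical one picked for illustration (the original example's phrase is not reproduced here), and the script assumes the file is saved as UTF-8 with mbstring enabled.

```php
<?php
// Hypothetical Spanish phrase; assumes this file is saved as UTF-8.
$phrase = 'Él comió piñas ayer';

// mb_substr counts characters, so we get exactly the first 14 characters.
$good = mb_substr($phrase, 0, 14, 'UTF-8');
echo $good, "\n"; // Él comió piñas

// substr counts bytes. É, ó and ñ each take two bytes in UTF-8, so
// 14 bytes end in the middle of "ñ" and leave a broken character behind.
$bad = substr($phrase, 0, 14);
var_dump(mb_check_encoding($bad, 'UTF-8')); // bool(false)
```

The encoding is passed explicitly to mb_substr so that the example does not depend on an earlier mb_internal_encoding call.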
String length counting is also incorrect when you use strlen with international characters, because it returns the number of bytes rather than characters. You need to use mb_strlen instead, as demonstrated in the example below…
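A short sketch of the length mismatch, using a hypothetical sample word and again assuming a UTF-8 source file with mbstring enabled. The same byte-versus-character gap affects strpos, so it is included for comparison.

```php
<?php
// Hypothetical sample word; "ñ" occupies two bytes in UTF-8.
$word = 'piñata';

$byteLength = strlen($word);             // counts bytes
$charLength = mb_strlen($word, 'UTF-8'); // counts characters

echo $byteLength, "\n"; // 7
echo $charLength, "\n"; // 6

// Offsets differ in the same way: the first "a" comes after the "ñ".
$bytePos = strpos($word, 'a');              // 4 (byte offset)
$charPos = mb_strpos($word, 'a', 0, 'UTF-8'); // 3 (character offset)
```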