PHP 5.6 default_charset change may break HTML output

An important note for everyone upgrading from PHP 5.4 or PHP 5.5 to PHP 5.6: the default_charset directive in php.ini changed from “empty” to UTF-8, making UTF-8 the default charset in PHP. This may break HTML output if you declare a different charset in your HTML head. It may also break functions like htmlentities() and htmlspecialchars(). For example:

Breaking HTML output with PHP 5.6 default_charset changed to UTF-8

Suppose you have the following lines in PHP code (don’t ask me why you would want to do this…):

<meta http-equiv="content-type"
  content="text/html; charset=ISO-8859-1"> 
<?php echo "éééééeeeeééé"; ?>

The PHP default charset setting in PHP 5.4 and PHP 5.5 prints the expected string on the screen: éééééeeeeééé.

In PHP 5.6, however, default_charset is set to UTF-8, so PHP sends a Content-Type response header with charset=UTF-8 by default:

Content-Type: text/html; charset=UTF-8

With default_charset set to UTF-8, HTML output and functions like htmlentities() / htmlspecialchars() can break (PHP bug #61354), because the ISO-8859-1 charset from the HTML meta tag is reported too, so the client sees two conflicting Content-Type declarations:

GET -uUsSed
User-Agent: lwp-request/2.07

GET --> 200 OK
Cache-Control: private
Date: Tue, 31 Mar 2015 12:03:50 GMT
Server: Microsoft-IIS/8.0
Content-Length: 141
Content-Type: text/html; charset=UTF-8
Content-Type: text/html; charset=ISO-8859-1
Client-Date: Tue, 31 Mar 2015 12:03:50 GMT
Client-Response-Num: 1

Pro Tip: While going through your PHP config, set correct values for curl.cainfo and openssl.cafile too. Don’t turn off CURLOPT_SSL_VERIFYPEER (or set sslverify to false for wp_remote_get() in WordPress).

The PHP manual says the following about the encoding parameter of htmlentities() and htmlspecialchars():

If omitted, the default value of the encoding varies depending on the PHP version in use. In PHP 5.6 and later, the default_charset configuration option is used as the default value. PHP 5.4 and 5.5 will use UTF-8 as the default. Earlier versions of PHP use ISO-8859-1.

Although this argument is technically optional, you are highly encouraged to specify the correct value for your code if you are using PHP 5.5 or earlier, or if your default_charset configuration option may be set incorrectly for the given input.
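
To make the manual's advice concrete, here is a minimal sketch of passing the encoding explicitly. It assumes the input is ISO-8859-1 encoded; the variable names are illustrative and not taken from the original post.

```php
<?php
// "café" with the é stored as the single ISO-8859-1 byte 0xE9.
// On its own, that byte is not a valid UTF-8 sequence.
$latin1 = "caf\xE9";

// Encoding matches the input: the byte is turned into a named entity.
$matching = htmlentities($latin1, ENT_QUOTES, 'ISO-8859-1');

// Encoding does not match: without ENT_IGNORE or ENT_SUBSTITUTE,
// an invalid UTF-8 sequence makes the function return an empty string.
$mismatched = htmlentities($latin1, ENT_QUOTES, 'UTF-8');

var_dump($matching);
var_dump($mismatched);
```

The second call shows how output can silently vanish when the encoding in effect does not match the data being escaped.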

How to fix the HTML output in PHP 5.6 – and up (7.0, 7.1)

The most obvious solution to this problem is: don’t declare a conflicting character set in your HTML meta tag. That is, remove a tag like this one:

<meta http-equiv="content-type"
  content="text/html; charset=ISO-8859-1">
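
If you drop the meta tag and let PHP declare UTF-8, any text that is still stored as ISO-8859-1 has to be converted before it is printed. A minimal sketch, assuming the iconv extension is available and the input really is ISO-8859-1 (the variable names are illustrative):

```php
<?php
// Assumption: $latin1 holds ISO-8859-1 bytes, e.g. read from a legacy file or database.
$latin1 = "caf\xE9";

// iconv(from, to, string) returns the converted string, or false on failure.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

if ($utf8 === false) {
    // Conversion failed; fall back to the original bytes rather than printing nothing.
    $utf8 = $latin1;
}

echo $utf8, "\n";
```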

Other options to set a correct default character set (default_charset) are:

Create a user-defined php.ini to overrule the default_charset directive

PHP supports user-defined php.ini files, in which you can overrule some php.ini settings. Neat!

Upload your user-defined php.ini to your webroot containing the following line:

default_charset = ""

This tells PHP not to add charset=UTF-8 to the Content-Type response header.

Overrule default_charset with ini_set()

And last but not least, you can overrule this setting with PHP’s ini_set() function:

ini_set( 'default_charset', '' );
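
A slightly fuller sketch of the same call, which also checks that the change took effect. It has to run before any output is sent, and the variable names are illustrative:

```php
<?php
// ini_set() returns the previous value as a string, or false on failure.
$previous = ini_set('default_charset', '');

if ($previous === false) {
    // The host may not allow this directive to be changed at runtime.
    error_log('Could not change default_charset');
}

// Read the value back to confirm what PHP will use for this request.
$current = ini_get('default_charset');
```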


4 replies
  1. RemBem says:

    Thank you, this saved my day, fixing an old site suddenly full of question marks after upgrading PHP. All your other website performance articles are also very helpful!


  2. Anonymous says:

    In PHP 7.0.19, an empty value for the default_charset directive (default_charset = "") causes an HTTP 500 “Internal Server Error”.
    You can solve this problem by setting the value to “none”. For example:
    php_value default_charset = none
    php_value default_charset = "none"
    php_value default_charset = 'none'


