Overview of System Character Encoding

This section describes the character encoding formats supported by the FW.

Character encoding is the set of rules by which a system stores, identifies, and uses characters. For the FW, character encoding is the rule that the FW uses to store configuration items such as security policy names and user names.

Many character encoding formats have been developed over time. The FW supports the following character encoding formats:

GBK encoding: default encoding format used by the FW. GBK was developed in China. It supports simplified and traditional Chinese characters in addition to ASCII characters such as letters, digits, and symbols. GBK stores each character in one or two bytes: an ASCII character occupies one byte, and a Chinese character occupies two bytes.
UTF-8 encoding: universal encoding format that supports the characters of almost all languages. UTF-8 is a variable-length encoding format. Each character occupies 1 to 4 bytes. An ASCII character occupies one byte; most Chinese and Japanese characters occupy three bytes.
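The storage difference between the two formats can be checked with a short script. The following is a minimal sketch that uses Python's built-in gbk and utf-8 codecs, not anything running on the FW; the helper name byte_lengths and the sample characters are assumptions chosen for illustration.

```python
# Illustrative sketch only: compares storage size under GBK and UTF-8
# using Python's standard codecs. byte_lengths is a hypothetical helper.

def byte_lengths(text):
    """Return (gbk_len, utf8_len) for text.

    gbk_len is None when the text contains a character that
    Python's gbk codec cannot represent.
    """
    try:
        gbk_len = len(text.encode("gbk"))
    except UnicodeEncodeError:
        gbk_len = None
    utf8_len = len(text.encode("utf-8"))
    return gbk_len, utf8_len


if __name__ == "__main__":
    # "A" is ASCII, "中" is a common Chinese character,
    # "ß" is a German letter outside the GBK repertoire.
    for sample in ["A", "中", "ß"]:
        print(repr(sample), byte_lengths(sample))
```

An ASCII character takes one byte in both formats, while a common Chinese character takes two bytes in GBK and three bytes in UTF-8.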

The character encoding format setting on the FW is determined by the current language environment and the character encoding formats supported by peripheral systems:

The system uses GBK by default, which meets the input requirements of common languages such as Chinese and English. The encoding format needs to be switched to UTF-8 only when characters that GBK does not support (such as some German and French letters) are used.
When the FW interconnects with another system that uses UTF-8 and stores non-ASCII characters, the FW must be switched to UTF-8. For example, if an authentication server stores user names in the UTF-8 format, user names that contain non-ASCII characters cannot be imported correctly to an FW that uses the GBK format. As a result, the affected users cannot log in to the FW.
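Both conditions above can be illustrated outside the FW. The following is a minimal sketch using Python's standard codecs; the function names can_store_in_gbk and read_as_gbk and the sample user names are hypothetical, and the script does not reproduce the FW's actual import logic.

```python
# Illustrative sketch only: shows why a GBK reader cannot interpret
# UTF-8 data correctly. All names and sample values are hypothetical.

def can_store_in_gbk(name):
    """Return True if every character in name is representable in GBK."""
    try:
        name.encode("gbk")
        return True
    except UnicodeEncodeError:
        return False


def read_as_gbk(raw_bytes):
    """Decode bytes as GBK, replacing undecodable sequences with U+FFFD."""
    return raw_bytes.decode("gbk", errors="replace")


if __name__ == "__main__":
    # A name with a character outside GBK calls for UTF-8.
    print(can_store_in_gbk("Straße"))  # False

    # A server stores a Chinese user name as UTF-8 bytes; a reader that
    # assumes GBK recovers a different string, so the name does not match.
    stored = "张三".encode("utf-8")
    print(read_as_gbk(stored) == "张三")  # False

    # Pure ASCII names are identical in both encodings.
    print(read_as_gbk("admin".encode("utf-8")) == "admin")  # True
```

Names made up only of ASCII characters survive the mismatch, which is why the switch to UTF-8 matters only when non-ASCII characters are involved.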