Overview of System Character Encoding
This section describes the character encoding formats supported
by the FW.
Character encoding is the set of rules by which a system stores,
identifies, and uses characters. On the FW, character encoding is
the rule used to store configuration data such as security policy names
and user names.
Historically, multiple encoding formats have been developed. The FW supports the following
character encoding formats:
- GBK encoding: default encoding format used by the FW. GBK was developed in
China. It supports simplified and traditional Chinese characters in
addition to ASCII characters such as letters, digits, and symbols.
GBK stores each character in one or two bytes. An ASCII character occupies
one byte; a Chinese character occupies two bytes.
- UTF-8 encoding: universal encoding format that supports the characters
of almost all languages. UTF-8 is a variable-length encoding format.
Each character occupies 1 to 4 bytes. An ASCII character occupies
one byte; a Chinese or Japanese character typically occupies
three bytes.
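The storage sizes described above can be checked with a short Python sketch. It relies on Python's built-in "gbk" and "utf-8" codecs, which are assumed here to be close enough to the FW's encodings for illustration; the helper name byte_length and the sample characters are hypothetical and are not part of the FW.

```python
def byte_length(ch, encoding):
    """Return how many bytes ch occupies in the given encoding,
    or None if the encoding cannot represent the character."""
    try:
        return len(ch.encode(encoding))
    except UnicodeEncodeError:
        return None


# One ASCII letter, one ASCII digit, and two Chinese characters.
samples = ["A", "7", "中", "文"]

for ch in samples:
    print(ch, "GBK:", byte_length(ch, "gbk"), "UTF-8:", byte_length(ch, "utf-8"))
```

For ASCII characters both encodings use one byte. For the Chinese samples, GBK uses two bytes and UTF-8 uses three, so the same configuration text can occupy different amounts of storage depending on the encoding format in use.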
The character encoding format setting on the FW is determined by the
current language environment and the character encoding formats supported
by peripheral systems:
- The FW uses GBK by default, which meets the input requirements
of Chinese and English. The encoding format
needs to be switched to UTF-8 only when characters that GBK does not
support (such as some accented letters used in German and French) are used.
- When the FW interconnects with another system that uses UTF-8 and stores
non-ASCII characters, the FW must be switched to UTF-8. For example, if an
authentication server stores user names in the UTF-8 format, those user names
cannot be imported to an FW that uses the GBK format. As a result, the
affected users cannot log in to the FW.
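Both cases can be illustrated with a minimal Python sketch. It does not model the FW's actual import logic; it only shows, using Python's built-in codecs, why a character outside GBK cannot be stored and why UTF-8 bytes read as GBK no longer match the original name. The function names fits_gbk and read_as_gbk and the sample names are hypothetical.

```python
def fits_gbk(text):
    """Return True if every character in text can be encoded in GBK."""
    try:
        text.encode("gbk")
        return True
    except UnicodeEncodeError:
        return False


def read_as_gbk(name):
    """Simulate a GBK-configured system reading a name stored as UTF-8 bytes.
    Undecodable byte sequences are replaced instead of raising an error."""
    raw = name.encode("utf-8")
    return raw.decode("gbk", errors="replace")


# Case 1: characters that GBK cannot represent.
for name in ["admin", "张三", "Straße", "garçon"]:
    print(name, "fits GBK:", fits_gbk(name))

# Case 2: a UTF-8 user name interpreted with the wrong encoding.
for name in ["admin", "张三"]:
    seen = read_as_gbk(name)
    print(name, "read as GBK:", seen, "matches:", seen == name)
```

Pure ASCII names survive either way because both encodings store ASCII identically. The Chinese name no longer matches once its UTF-8 bytes are interpreted as GBK, which is consistent with the requirement that both sides use the same encoding format when non-ASCII characters are involved.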