URL Matching Rule
This section describes the URL format, URL rule, and matching mode.
URL Format
The standard format of a URL is protocol://hostname[:port]/path[?query]. Table 1 lists the parameters.
Table 1 URL parameters
| Field | Description |
|---|---|
| protocol | An application protocol. Currently, the FW supports only HTTP and HTTPS. |
| hostname | Domain name or IP address of the Web server. |
| :port | (Optional) A port number. Application protocols have default port numbers. For example, the default HTTP port is 80, and the default HTTPS port is 443. If the server uses the default port, you do not need to configure the port number in a URL filtering rule. If the server uses a non-default port, the port number is mandatory in a URL filtering rule. |
| path | A directory or file path on the host. The path is a character string whose segments are separated by slashes (/). |
| ?query | (Optional) This field is used to transmit parameters to dynamic Web pages. |
Figure 1 uses "http://www.example.com:8088/news/edu.aspx?name=tom&age=20" as an example:
Figure 1 URL format
- http represents the protocol.
- www.example.com represents the hostname.
- 8088 represents the communication port.
- news/edu.aspx represents the path.
- name=tom&age=20 represents the parameters.
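The breakdown above can be reproduced with a general-purpose URL parser. The following is a minimal sketch that uses Python's standard urllib.parse module; it only illustrates the URL format and is not how the FW itself parses URLs.

```python
from urllib.parse import urlsplit

# Split the example URL from Figure 1 into its components.
parts = urlsplit("http://www.example.com:8088/news/edu.aspx?name=tom&age=20")

print(parts.scheme)    # protocol: http
print(parts.hostname)  # hostname: www.example.com
print(parts.port)      # port: 8088
print(parts.path)      # path, with its leading slash: /news/edu.aspx
print(parts.query)     # query: name=tom&age=20
```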
URL Rule and Matching Mode
You can configure URL rules and domain name (HOST) rules in the whitelist, the blacklist, user-defined categories, and predefined categories. A URL rule is matched against the whole URL, whereas a HOST rule is matched against only the domain name (or IP address). They are used in the following scenarios:
- If the permitted or blocked URLs are in the domain name format, either URL rules or HOST rules can be configured in most cases. The two rule types have the same filtering effect. For example, permit or block access to the domain name www.example.com.
- If the permitted or blocked URLs are in the level-2 domain name format and only a small number of URLs are configured, either URL rules or HOST rules can be configured. If a large number of URLs are configured, configuring HOST rules is simpler. For example, permit or block access to the level-2 domain name news.example.com.
- If the permitted or blocked URLs carry directory and parameter information, only URL rules can be configured. For example, permit or block access to the URL www.example.com/news.
URL matching modes include prefix matching, suffix matching, keyword matching, and exact matching. Both URL rules and HOST rules support these matching modes. Table 2 compares the URL matching modes.
Table 2 URL matching modes
| Mode | Definition | Example |
|---|---|---|
| Prefix matching | Matches URLs that start with the specified character string, for example, www.example*. | To control access to all websites starting with www.example, set the URL filtering rule to www.example*. |
| Suffix matching | Matches URLs that end with the specified character string, for example, *aspx. | To control access to all image web pages under the www.example.com website, set the URL filtering rules to *.jpg, *.jpeg, *.gif, *.png, and *.bmp. |
| Keyword matching | Matches URLs that include the specified character string, for example, *sport*. | To control access to all websites that contain sport, set the URL filtering rule to *sport*. |
| Exact matching | First checks whether the URL matches the specified character string. If it does not, the FW removes the last directory from the URL and checks again. If the URL still does not match, the FW removes the second-to-last directory and checks again, and so on until only the domain name remains to be compared with the specified character string, for example, www.example.com. | To control access to all websites under the www.example.com domain name, set the URL filtering rule to www.example.com. |
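The four modes in Table 2 can be sketched in a few lines of Python. This is an illustration only, not the FW implementation. The function names classify and matches are hypothetical, the mode is assumed to be inferred from the position of the wildcard (*), and the URL is assumed to be passed in without its protocol and query.

```python
def classify(rule: str) -> str:
    """Infer the matching mode from where the wildcard (*) sits (assumption)."""
    starts, ends = rule.startswith("*"), rule.endswith("*")
    if starts and ends and len(rule) > 1:
        return "keyword"
    if ends:
        return "prefix"
    if starts:
        return "suffix"
    return "exact"


def matches(rule: str, url: str) -> bool:
    """Return True if the URL matches the rule under the rule's mode."""
    mode = classify(rule)
    text = rule.strip("*")  # the character string without wildcards
    if mode == "keyword":
        return text in url
    if mode == "prefix":
        return url.startswith(text)
    if mode == "suffix":
        return url.endswith(text)
    # Exact matching: compare, then drop the last directory and retry
    # until only the domain name is left.
    candidate = url.rstrip("/")
    target = rule.rstrip("/")
    while True:
        if candidate == target:
            return True
        if "/" not in candidate:
            return False
        candidate = candidate.rsplit("/", 1)[0]


print(matches("www.example.com", "www.example.com/news/index.html"))
print(matches("*sport*", "news.example.com/sports/today"))
```

In this sketch, the exact matching rule www.example.com matches www.example.com/news/index.html because the URL is reduced to www.example.com after two directories are removed.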
The priorities of URL matching modes are as follows:
Exact matching > suffix matching > prefix matching > keyword matching
For example, the URL www.example.com/news first matches the exact matching rule www.example.com/news among the following rules:
- Exact matching: www.example.com/news
- Prefix matching: www.example.com/*
- Keyword matching: *example*
In the same matching mode, a longer rule has a higher priority than a shorter one. For example, the URL www.example.com/news/index.html first matches www.example.com/news/* among the following prefix matching rules:
- www.example.com/news/*
- www.example.com/*
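The two ordering rules above (matching mode first, then rule length) can be sketched as a sort key. This is a hedged illustration, not the FW algorithm. The names MODE_RANK, mode_of, hits, and best_rule are hypothetical, and rule length is assumed to be counted without wildcards, which is consistent with the length of 4 given for *.com* in Table 3.

```python
# Lower rank means higher priority: exact > suffix > prefix > keyword.
MODE_RANK = {"exact": 0, "suffix": 1, "prefix": 2, "keyword": 3}


def mode_of(rule):
    """Infer the mode from the wildcard position (assumption)."""
    if rule.startswith("*") and rule.endswith("*") and len(rule) > 1:
        return "keyword"
    if rule.endswith("*"):
        return "prefix"
    if rule.startswith("*"):
        return "suffix"
    return "exact"


def hits(rule, url):
    """Return True if the URL matches the rule."""
    text, mode = rule.strip("*"), mode_of(rule)
    if mode == "keyword":
        return text in url
    if mode == "prefix":
        return url.startswith(text)
    if mode == "suffix":
        return url.endswith(text)
    # Exact: the rule equals the URL or one of its parent directories.
    return url == text or url.startswith(text.rstrip("/") + "/")


def best_rule(rules, url):
    """Pick the matching rule with the best mode, then the longest string."""
    candidates = [r for r in rules if hits(r, url)]
    if not candidates:
        return None
    return min(candidates,
               key=lambda r: (MODE_RANK[mode_of(r)], -len(r.strip("*"))))


print(best_rule(["www.example.com/news", "www.example.com/*", "*example*"],
                "www.example.com/news"))
print(best_rule(["www.example.com/news/*", "www.example.com/*"],
                "www.example.com/news/index.html"))
```

Rules that tie on both mode and length are not resolved by this sketch; min simply returns the first of them.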
If the matching rules have the same length in the same matching mode, the action mode determines which rule the URL matches. As shown in Table 3, the two URL rules are in keyword matching mode and have the same length (4). For the URL www.example.com/welcome.html:
- If the action mode is Strict, the URL will match the category with a stricter action. In this example, the URL matches category B whose action is block.
- If the action mode is Lenient, the URL will match the category with a looser action. In this example, the URL matches category A whose action is permit.
Table 3 Action mode
| Item | URL Category | Control Action |
|---|---|---|
| *.com* | URL category A | Permit |
| *html* | URL category B | Block |
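The tie-break in Table 3 can be sketched as follows. This is a minimal illustration that models only the Permit and Block actions shown in the table; the Rule class, the resolve_tie function, and the "strict" and "lenient" string values are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    pattern: str
    category: str
    action: str  # "permit" or "block"


# Higher value means stricter action (only the two actions from Table 3).
STRICTNESS = {"permit": 0, "block": 1}


def resolve_tie(tied, action_mode):
    """Choose among rules with the same matching mode and the same length."""
    if action_mode not in ("strict", "lenient"):
        raise ValueError("action_mode must be 'strict' or 'lenient'")
    pick = max if action_mode == "strict" else min
    return pick(tied, key=lambda r: STRICTNESS[r.action])


tied = [Rule("*.com*", "URL category A", "permit"),
        Rule("*html*", "URL category B", "block")]

print(resolve_tie(tied, "strict").category)   # URL category B
print(resolve_tie(tied, "lenient").category)  # URL category A
```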
- In all of the matching modes, the FW removes http:// or https:// from the beginning of the entered character string. In the exact matching mode, the FW adds a slash (/) at the end of character strings that do not contain any slashes (/) after the hostname.
- URLs are case insensitive.
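The two preprocessing notes above can be sketched as one helper. This is an illustration under stated assumptions: the function name normalize is hypothetical, and lowercasing is only one possible way to make comparisons case insensitive; the FW may handle case differently.

```python
def normalize(entry: str, exact: bool = False) -> str:
    """Strip the protocol prefix and, for exact matching, add a trailing slash."""
    text = entry.lower()  # assumption: case insensitivity via lowercasing
    for scheme in ("http://", "https://"):
        if text.startswith(scheme):
            text = text[len(scheme):]
            break
    # In exact matching mode, a string with no slash after the hostname
    # gets a slash appended.
    if exact and "/" not in text:
        text += "/"
    return text


print(normalize("HTTP://WWW.Example.com", exact=True))  # www.example.com/
print(normalize("https://www.example.com/News"))        # www.example.com/news
```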