Interpreting HTML: Namespaces and Character Encodings

Dec 17, 2016 pm 02:10 PM

When working on projects, we often establish conventions so that teams can cooperate better and deliver the project more smoothly. Similarly, we often hear about various protocols: Google's IM software Gtalk uses an open protocol, and a web document needs the HTTP protocol before it can be presented to the user.

For the same reason, because browsers use different rendering engines and apply different default styles, a shared set of rules is needed so that the same web document renders consistently across browsers. That rule is the DOCTYPE declaration.

Because the Internet is interconnected, any two or more web documents may need to exchange data. And because XML lets users define their own tags, two exchanged documents may define tags with the same name, which causes conflicts. A namespace is therefore needed to distinguish identically named tags in the exchanged documents.
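To make the naming conflict concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The two namespace URIs and the element names are hypothetical, chosen only for illustration: both vocabularies define a table tag, and the namespace is what keeps them apart.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs: one vocabulary for markup, one for furniture.
MARKUP_NS = "http://example.com/markup"
FURNITURE_NS = "http://example.com/furniture"

DOC = f"""<root xmlns:m="{MARKUP_NS}" xmlns:f="{FURNITURE_NS}">
  <m:table><m:row>cell</m:row></m:table>
  <f:table><f:name>Coffee table</f:name></f:table>
</root>"""

root = ET.fromstring(DOC)

# ElementTree expands each prefixed tag to "{namespace-uri}localname",
# so the two <table> elements end up with different qualified names.
tags = [child.tag for child in root]

# Look up only the furniture table, using a prefix-to-URI mapping.
furniture_tables = root.findall("f:table", {"f": FURNITURE_NS})
furniture_name = furniture_tables[0].find("f:name", {"f": FURNITURE_NS}).text
```

Both elements share the local name table, but their qualified names differ, so a consumer can select one vocabulary without accidentally matching the other.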

XHTML, as a transitional language from HTML to XML, does not support the user-defined tags that XML allows, so every XHTML document declares the same namespace: the html element carries an xmlns attribute whose value is http://www.w3.org/1999/xhtml.

xmlns is short for "XML namespace". Like the DOCTYPE declaration, xmlns is a kind of declaration. Unlike DOCTYPE, which also appears in HTML documents, xmlns is not needed in HTML documents; the xmlns declarations we usually see appear in XHTML documents.
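As a rough illustration of what the declaration does, the Python sketch below parses a small XHTML-style document with an XML parser. The page content is made up for the example; the namespace URI is the standard XHTML one. Because xmlns is declared on the root element without a prefix, it becomes the default namespace for every unprefixed element inside it.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# A made-up, minimal page; only the xmlns value is standard.
PAGE = f"""<html xmlns="{XHTML_NS}">
  <head><title>Demo page</title></head>
  <body><p>Hello</p></body>
</html>"""

root = ET.fromstring(PAGE)

# Every element in the tree inherits the default namespace.
all_tags = [element.tag for element in root.iter()]
all_in_xhtml_ns = all(tag.startswith("{" + XHTML_NS + "}") for tag in all_tags)

# Lookups must therefore be namespace-qualified as well.
title = root.find("x:head/x:title", {"x": XHTML_NS}).text
unqualified = root.find("head/title")  # no namespace given, so no match
```

The unqualified lookup finds nothing, which shows that to an XML parser the namespace is part of each element's name rather than a decoration.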

When making a web page, besides declaring the DOCTYPE (document type) at the top and, for an XHTML document, the namespace, there is a third thing to declare: the character encoding of the document, which is usually done with a meta element in the document head.

To be interpreted correctly by browsers and to pass W3C validation, every XHTML document should declare the character encoding it uses. Garbled characters in web documents are most often caused by an incorrect character encoding.
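The effect of a wrong encoding can be reproduced in a few lines of Python. This is only a sketch of the mechanism, not of what any particular browser does: the same bytes are decoded once with the encoding they were written in and once with a different one.

```python
text = "中文网页"

# Bytes as they would be stored by a tool that saves in gb2312.
raw = text.encode("gb2312")

# Decoding with the matching encoding recovers the original text.
decoded_correctly = raw.decode("gb2312")

# Decoding the same bytes as utf-8 fails; errors="replace" substitutes
# U+FFFD for each invalid sequence, which is one form of garbled output.
decoded_wrongly = raw.decode("utf-8", errors="replace")
```

The bytes never change; only the reader's assumption about the encoding does, which is why an explicit and correct declaration matters.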

utf-8 is a variable-length encoding of Unicode. As a universal character encoding, it is used by more and more web documents. Pages encoded in utf-8 do the most to avoid the garbled characters that differing encodings cause when users from different regions visit the same page.
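"Variable-length" means that different characters take a different number of bytes. A small Python sketch, with sample characters picked arbitrarily for illustration:

```python
# One sample character per UTF-8 length class.
samples = {
    "ascii letter": "A",            # U+0041
    "accented letter": "\u00e9",    # U+00E9, e with acute accent
    "chinese character": "\u4e2d",  # U+4E2D
    "emoji": "\U0001F600",          # U+1F600, outside the Basic Multilingual Plane
}

# Number of bytes each character occupies once encoded as UTF-8.
utf8_lengths = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}
```

Under these assumptions the lengths come out as 1, 2, 3 and 4 bytes respectively, while every sample is still a single character to the reader.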

But on most websites in China, especially the large portals, the declared character encoding is not utf-8 but gb2312.

Besides gb2312, some websites use gbk or gb18030. All three are encodings for Simplified Chinese. In other words, if a computer without Simplified Chinese support installed visits a Chinese web page encoded in gb2312, it will display garbled characters.
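The three encodings are not interchangeable: gbk extends gb2312, and gb18030 extends gbk. The following Python sketch probes which of the three codecs can represent a few sample characters. The samples and the helper name can_encode are choices made for this illustration.

```python
CODECS = ("gb2312", "gbk", "gb18030")

def can_encode(ch: str, codec: str) -> bool:
    """Return True if the codec has a byte sequence for this character."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

samples = {
    "simplified": "\u9f99",   # simplified form of "dragon"
    "traditional": "\u9f8d",  # traditional form of "dragon"
    "emoji": "\U0001F600",    # outside the Basic Multilingual Plane
}

# For each sample, list the codecs that are able to encode it.
support = {
    name: [codec for codec in CODECS if can_encode(ch, codec)]
    for name, ch in samples.items()
}
```

The simplified character is covered by all three codecs, the traditional one only by gbk and gb18030, and the emoji only by gb18030, so a page declared as gb2312 has the narrowest repertoire of the three.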

Since gb2312 can produce garbled characters for users in other regions, why not use utf-8?

One reason is probably historical. The other, more important reason is the difference in document size that results from how the two encodings store characters.

With the gb2312 encoding, a Chinese character occupies 2 bytes, whereas in UTF-8 a Chinese character usually occupies 3 bytes, and sometimes more. For the same Chinese document, therefore, the gb2312-encoded file is smaller than the utf-8-encoded one.
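The size difference is easy to measure. In this Python sketch the sample strings are arbitrary; the point is the per-character byte cost of Chinese text versus ASCII markup in each encoding.

```python
chinese_text = "中文网页"   # 4 Chinese characters
ascii_markup = "<p></p>"    # 7 ASCII characters

# Size in bytes of the Chinese text under each encoding.
gb2312_size = len(chinese_text.encode("gb2312"))
utf8_size = len(chinese_text.encode("utf-8"))

# ASCII characters cost one byte in both encodings.
markup_gb2312_size = len(ascii_markup.encode("gb2312"))
markup_utf8_size = len(ascii_markup.encode("utf-8"))

# Extra bytes UTF-8 needs for the Chinese part, as a fraction of gb2312.
overhead = (utf8_size - gb2312_size) / gb2312_size
```

For pure Chinese text the UTF-8 version is about 50% larger, but tags, attributes and other ASCII content cost the same in both, so the overhead for a whole page is smaller than that.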

For Chinese websites with a lot of text and high traffic, gb2312-encoded documents save considerable bandwidth in download and transmission. In addition, the audience of Chinese websites consists almost entirely of Chinese users. These are the reasons many websites use gb2312 rather than utf-8.

However, few websites in China actually combine that much text with that much traffic. Given that and the risk of garbled characters, UTF-8 is the recommended encoding when making web pages.

Of course, whichever encoding is used, the most important thing is that the entire site uses the same one consistently.

Besides the declaration method above, you may also come across another way of declaring the character encoding. That method targets older browsers; since browsers have generally been updated, it is no longer recommended.
