Notes on the byte length of Chinese and English characters under different encodings in PHP and Java

Jul 29, 2016 am 08:56 AM
1.PHP

Like C, PHP treats a string as a sequence of bytes, so strlen() returns a byte count, not a character count. In GBK, an English (ASCII) character occupies 1 byte and a Chinese character occupies 2 bytes. In UTF-8, an English character still occupies 1 byte, but a Chinese character occupies 3 or 4 bytes (usually 3). This causes trouble when you need the number of characters in a string or want to truncate it at a character boundary. For example:

<?php
$str = "我爱你Iloveyou";
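echo strlen($str); // 17 under UTF-8, 14 under GBK.
// But what if you are asked for the character length of $str, or need to
// display only the first 6 characters followed by an ellipsis?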
?>

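Answers to these questions are easy to find online. The simplest approach is to use PHP's mbstring extension, whose functions count characters instead of bytes. With the string above stored as UTF-8, mb_strlen($str, 'UTF-8') returns 11, and mb_substr($str, 0, 6, 'UTF-8') returns the first 6 characters, 我爱你Ilo, to which you can append an ellipsis.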

2.Java

A char in Java is 2 bytes. Java represents strings internally as UTF-16, so each char is one 16-bit code unit. Every Chinese or English character in the Basic Multilingual Plane occupies exactly one char (2 bytes), while rarer characters outside that plane take two chars. When a string is converted to bytes with another encoding, however, the number of bytes per character depends on that encoding. For example:

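public class Test {
    public static void main(String[] args) {
        String str = "我们aaaaa";
        // getBytes() with no argument encodes with the JVM's default charset,
        // so this value depends on the environment.
        int byteLen = str.getBytes().length;
        // length() counts chars (UTF-16 code units), independent of encoding.
        int len = str.length();
        System.out.println("Byte length: " + byteLen);
        System.out.println("Character length: " + len);
    }
}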

In the example above, the output is 9 and 7 when the default encoding is GBK, and 11 and 7 when it is UTF-8. In other words, str.length() returns the same value regardless of the encoding. It returns the number of chars (UTF-16 code units) in the string, so each Chinese or English character in this example counts as one.
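
Because getBytes() without an argument depends on the default charset, the numbers above are easier to reproduce when the charset is named explicitly. The following is a minimal sketch, not part of the original article: the class name EncodingLengths and the helpers byteLength and truncate are hypothetical names chosen for illustration. It assumes the runtime may or may not ship the GBK charset, so that case is guarded with Charset.isSupported. The string literals use Unicode escapes so the source compiles the same way whatever encoding the compiler assumes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingLengths {
    // Hypothetical helper: byte length of s under an explicitly named charset.
    public static int byteLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    // Hypothetical helper: keep at most maxChars characters, counted as
    // Unicode code points, and append "..." if anything was cut off.
    public static String truncate(String s, int maxChars) {
        if (s.codePointCount(0, s.length()) <= maxChars) {
            return s;
        }
        // Index (in chars) of the position maxChars code points from the start.
        int end = s.offsetByCodePoints(0, maxChars);
        return s.substring(0, end) + "...";
    }

    public static void main(String[] args) {
        // Two Chinese characters (U+6211, U+4EEC) followed by "aaaaa".
        String str = "\u6211\u4eec" + "aaaaa";

        System.out.println("UTF-8 bytes:    " + byteLength(str, StandardCharsets.UTF_8));
        System.out.println("UTF-16BE bytes: " + byteLength(str, StandardCharsets.UTF_16BE));
        // GBK is not guaranteed to be present in every Java runtime.
        if (Charset.isSupported("GBK")) {
            System.out.println("GBK bytes:      " + byteLength(str, Charset.forName("GBK")));
        }
        System.out.println("length():       " + str.length());

        // A character outside the Basic Multilingual Plane (U+1F600) takes
        // two chars, so length() and the code point count differ.
        String emoji = "\uD83D\uDE00";
        System.out.println("emoji length(): " + emoji.length());
        System.out.println("emoji points:   " + emoji.codePointCount(0, emoji.length()));

        // First 6 characters of the PHP example string, then an ellipsis.
        String php = "\u6211\u7231\u4f60" + "Iloveyou";
        System.out.println(truncate(php, 6));
    }
}
```

Counting and cutting by code points rather than by chars is a design choice here: it keeps the truncation from splitting a surrogate pair, at the cost of an extra pass over the string.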

The above covers how the byte length of Chinese and English characters relates to the encoding in PHP and Java. I hope it is helpful to readers interested in PHP tutorials.
