Java character stream example analysis
1. The origin of character streams
Handling Chinese text with byte streams is inconvenient, so Java provides character streams for working with it.
Implementation principle: character stream = byte stream + encoding table
Why does copying a text file that contains Chinese characters with a byte stream cause no problems?
Because the copy transfers every byte unchanged, and whatever reads the file afterwards reassembles those bytes into Chinese characters.
How can a byte be recognized as part of a Chinese character?
When a Chinese character is stored, in either UTF-8 or GBK, its first byte is negative when read as a signed Java byte, which marks it as part of a multi-byte character.
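The negative first byte can be observed directly. The following is a minimal sketch, not part of the original article: the class name `ChineseByteSigns` and the helper `firstByteIsNegative` are hypothetical, and the GBK part only runs when the JVM provides that charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: inspects the signed byte values of an encoded character.
class ChineseByteSigns {

    // Returns true when the first encoded byte is negative as a signed Java byte.
    static boolean firstByteIsNegative(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return bytes.length > 0 && bytes[0] < 0;
    }

    public static void main(String[] args) {
        // The character 中, written as an escape to avoid source-file encoding issues.
        String chinese = "\u4e2d";

        System.out.println(Arrays.toString(chinese.getBytes(StandardCharsets.UTF_8))); // [-28, -72, -83]
        System.out.println(firstByteIsNegative(chinese, StandardCharsets.UTF_8));       // true
        System.out.println(firstByteIsNegative("a", StandardCharsets.UTF_8));           // false

        // GBK is not one of the charsets every JVM must provide, so check first.
        if (Charset.isSupported("GBK")) {
            System.out.println(Arrays.toString(chinese.getBytes(Charset.forName("GBK")))); // [-42, -48]
        }
    }
}
```

An ASCII character such as `a` encodes to a single non-negative byte, which is how the two cases can be told apart.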
2. Encoding tables
Character set:
A character set is the collection of all characters supported by a system, including national scripts, punctuation marks, graphic symbols, digits, and so on.
To store and recognize the symbols of a character set accurately, a computer must encode the characters, so every character set has at least one character encoding.
Common character sets include the ASCII character set, the GBXXX character sets, and the Unicode character set.
GBK: the most commonly used Chinese encoding table. It is an extension of the GB2312 standard, uses a double-byte encoding scheme, and contains 21,003 Chinese characters. It is fully compatible with GB2312 and also supports traditional Chinese characters and the Chinese characters used in Japanese and Korean.
GB18030: the latest Chinese encoding table. It contains 70,244 Chinese characters and uses a multi-byte encoding in which each character takes 1, 2, or 4 bytes. It supports the scripts of China's ethnic minorities as well as traditional Chinese characters and the Chinese characters used in Japanese and Korean.
Unicode character set:
Unicode is designed to express any character in any language. It is an industry standard, also known as the Unified Code or Universal Code, and it uses up to 4 bytes to represent each letter, symbol, or character. It has three encoding schemes: UTF-8, UTF-16, and UTF-32. The most commonly used is UTF-8.
UTF-8: can represent any character in the Unicode standard and is the preferred encoding for email, web pages, and other applications that store or transfer text. The Internet Engineering Task Force (IETF) requires all Internet protocols to support UTF-8. It uses one to four bytes to encode each character.
UTF-8 encoding rules:
The 128 US-ASCII characters need only one byte
Latin-script and similar characters need two bytes
Most commonly used characters (including Chinese) need three bytes
The remaining, rarely used Unicode supplementary characters need four bytes
Summary: text must be decoded with the same rules that were used to encode it; otherwise the result is garbled.
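The byte counts in the UTF-8 rules above can be checked with `String.getBytes`. This is a small sketch with one sample character per rule. The class name `Utf8Lengths` and the helper `utf8Length` are hypothetical, and the characters are written as Unicode escapes so the result does not depend on the source file's encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: counts how many bytes UTF-8 needs for a piece of text.
class Utf8Lengths {

    // Encodes the text as UTF-8 and returns the number of bytes produced.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // 1 (US-ASCII)
        System.out.println(utf8Length("\u00e9"));       // 2 (é, a Latin letter with an accent)
        System.out.println(utf8Length("\u4e2d"));       // 3 (中, a common Chinese character)
        System.out.println(utf8Length("\uD83D\uDE00")); // 4 (U+1F600, a supplementary character)
    }
}
```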
3. Encoding and decoding strings
Encoding methods (IDEA):
byte[] getBytes(): encodes the String into a sequence of bytes using the platform's default character set and stores the result in a new byte array
byte[] getBytes(String charsetName): encodes the String into a sequence of bytes using the named character set and stores the result in a new byte array
Decoding methods (IDEA):
String(byte[] bytes): constructs a new String by decoding the given byte array using the platform's default character set
String(byte[] bytes, String charsetName): constructs a new String by decoding the given byte array using the named character set
The default encoding in IDEA is UTF-8
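The encoding and decoding methods above can be combined into a round trip. The sketch below is illustrative only: `StringCodecDemo`, `encode`, and `decode` are hypothetical names, and the checked `UnsupportedEncodingException` is wrapped so the helpers are easy to call. It shows that decoding with the same charset restores the text, while decoding with a different one does not.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical demo class: wraps String.getBytes(String) and new String(byte[], String).
class StringCodecDemo {

    // Encodes text with the named charset.
    static byte[] encode(String text, String charsetName) {
        try {
            return text.getBytes(charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    // Decodes bytes with the named charset.
    static String decode(byte[] bytes, String charsetName) {
        try {
            return new String(bytes, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String text = "\u4f60\u597d"; // 你好
        byte[] utf8 = encode(text, "UTF-8");
        System.out.println(utf8.length);                         // 6 (three bytes per character)
        System.out.println(decode(utf8, "UTF-8").equals(text));  // true

        // Assumption: fall back to ISO-8859-1 when this JVM does not ship GBK.
        String other = Charset.isSupported("GBK") ? "GBK" : "ISO-8859-1";
        System.out.println(decode(utf8, other).equals(text));    // false: the text is garbled
    }
}
```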
4. Encoding and decoding with character streams
Abstract base classes of character streams:
Reader: the abstract class for character input streams
Writer: the abstract class for character output streams
Two classes in the character stream family deal with encoding and decoding:
InputStreamReader: a bridge from byte streams to character streams. It reads bytes and decodes them into characters using a specified character set. The character set can be specified by name, given explicitly, or left as the platform's default character set
Constructors:
Constructor | Description |
InputStreamReader(InputStream in) | Creates an InputStreamReader that uses the default character set. |
InputStreamReader(InputStream in, String charsetName) | Creates an InputStreamReader that uses the named character set. |
OutputStreamWriter: a bridge from character streams to byte streams. It encodes the characters written to it into bytes using a specified character set. The character set can be specified by name, given explicitly, or left as the platform's default character set
Constructors:
Constructor | Description |
OutputStreamWriter(OutputStream out) | Creates an OutputStreamWriter that uses the default character encoding. |
OutputStreamWriter(OutputStream out, String charsetName) | Creates an OutputStreamWriter that uses the named character set. |
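    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.OutputStreamWriter;

    public class ConversionStreamDemo {
        public static void main(String[] args) throws IOException {
            // Create an OutputStreamWriter that uses the default encoding and write the data first
            OutputStreamWriter opsw = new OutputStreamWriter(new FileOutputStream("E:\\abc.txt"));
            opsw.write("你好啊");
            opsw.close(); // close() flushes the buffered characters to the file

            // Create an InputStreamReader that uses the default encoding
            InputStreamReader ipsr = new InputStreamReader(new FileInputStream("E:\\abc.txt"));
            // Read the data, one character at a time
            int ch;
            while ((ch = ipsr.read()) != -1) {
                System.out.print((char) ch);
            }
            ipsr.close();
        }
    }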
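The constructors that take a charsetName can be exercised in the same way. The following sketch makes a few assumptions of its own: it uses a temporary file instead of a fixed path, and the class name `NamedCharsetStreams` and the helper `writeThenRead` are hypothetical. It writes text with one named character set and reads it back with another, so matching and mismatched names can be compared.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: round trip through a temp file with named charsets.
class NamedCharsetStreams {

    // Writes text using writeCharset, then reads the file back using readCharset.
    static String writeThenRead(String text, String writeCharset, String readCharset) {
        try {
            Path file = Files.createTempFile("charset-demo", ".txt");
            try {
                try (FileOutputStream fos = new FileOutputStream(file.toFile());
                     OutputStreamWriter out = new OutputStreamWriter(fos, writeCharset)) {
                    out.write(text);
                }
                StringBuilder result = new StringBuilder();
                try (FileInputStream fis = new FileInputStream(file.toFile());
                     InputStreamReader in = new InputStreamReader(fis, readCharset)) {
                    int ch;
                    while ((ch = in.read()) != -1) {
                        result.append((char) ch);
                    }
                }
                return result.toString();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "\u4f60\u597d\u554a"; // 你好啊
        System.out.println(writeThenRead(text, "UTF-8", "UTF-8").equals(text));      // true
        System.out.println(writeThenRead(text, "UTF-8", "ISO-8859-1").equals(text)); // false
    }
}
```

Naming the character set on both sides keeps the result independent of the platform default, which is the main reason to prefer these constructors when files move between machines.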
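5. Five ways to write data with a character stream
Method | Description |
void write(int c) | Writes a single character |
void write(char[] cbuf) | Writes a character array |
void write(char[] cbuf, int off, int len) | Writes part of a character array |
void write(String str) | Writes a string |
void write(String str, int off, int len) | Writes part of a string |
Character streams write through a buffer. To push the buffered data out to the file, call flush() after the write call.
The first three methods are used in the same way as the byte-stream write methods; the last two are the ones specific to character streams.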
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;

    public class OutputStreamWriterDemo {
        public static void main(String[] args) throws IOException {
            // Create an OutputStreamWriter that uses the default encoding
            OutputStreamWriter opsw = new OutputStreamWriter(new FileOutputStream("E:\\abc.txt"));

            // Method 1: write a single character (97 is 'a')
            opsw.write(97);
            opsw.flush(); // flush so the data shows up in the file immediately

            // Method 2: write a character array
            char[] ch = {'a', 'b', 'c', '二'};
            opsw.write(ch);
            opsw.flush();

            // Method 3: write part of a character array
            opsw.write(ch, 0, 2);
            opsw.flush();

            // Method 4: write a string
            opsw.write("一二三");
            opsw.flush();

            // Method 5: write part of a string
            opsw.write("三四五", 1, 2);
            opsw.flush();

            opsw.close(); // release the file when done
        }
    }
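The effect of the five write calls and of flush() can be seen without touching the file system. This sketch is an illustration under its own assumptions: it swaps the file for an in-memory `ByteArrayOutputStream`, fixes the charset to UTF-8, and uses the hypothetical names `WriterFlushDemo` and `writeAll`. How many bytes reach the underlying stream before flush() depends on the writer's internal buffer, which is an implementation detail, so that value is printed without an expected result.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: runs the five write variants against an in-memory sink.
class WriterFlushDemo {

    // Performs the five writes and returns everything that reached the byte stream.
    static String writeAll() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStreamWriter writer = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            char[] chars = {'a', 'b', 'c', '\u4e8c'};     // a, b, c, 二
            writer.write(97);                             // single character 'a'
            writer.write(chars);                          // whole array
            writer.write(chars, 0, 2);                    // "ab"
            writer.write("\u4e00\u4e8c\u4e09");           // 一二三
            writer.write("\u4e09\u56db\u4e94", 1, 2);     // 四五 (offset 1, length 2)

            System.out.println("bytes in sink before flush: " + sink.size());
            writer.flush();
            System.out.println("bytes in sink after flush: " + sink.size()); // 24
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Only the length is printed, so the output does not depend on the console encoding.
        System.out.println(writeAll().length()); // 12 characters
    }
}
```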
6. Two ways to read data with a character stream
Method | Description |
int read() | Reads one character at a time |
int read(char[] cbuf) | Reads one character array at a time |
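    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;

    public class InputStreamReadDemo {
        public static void main(String[] args) throws IOException {
            // Method 1: read one character at a time
            InputStreamReader ipsr = new InputStreamReader(new FileInputStream("E:\\abc.txt"));
            int ch;
            while ((ch = ipsr.read()) != -1) {
                System.out.print((char) ch);
            }
            ipsr.close();

            // Method 2: read one character array at a time.
            // The first reader is closed and exhausted, so open the file again.
            ipsr = new InputStreamReader(new FileInputStream("E:\\abc.txt"));
            char[] chs = new char[1024];
            int len;
            while ((len = ipsr.read(chs)) != -1) {
                System.out.print(new String(chs, 0, len));
            }
            ipsr.close();
        }
    }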
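Summary: when the default encoding is used, the character input stream InputStreamReader can be replaced by its subclass FileReader, and the character output stream OutputStreamWriter can be replaced by its subclass FileWriter. With the default encoding, each pair behaves the same way.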
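A minimal sketch of that substitution follows. It is not from the original article: it assumes a temporary file, ASCII sample text (so the result does not depend on the platform's default character set), and the hypothetical names `FileReaderWriterDemo` and `roundTrip`.

```java
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: FileWriter and FileReader with the default character set.
class FileReaderWriterDemo {

    // Writes text with FileWriter, then reads it back with FileReader.
    static String roundTrip(String text) {
        try {
            Path file = Files.createTempFile("file-rw-demo", ".txt");
            try {
                try (FileWriter writer = new FileWriter(file.toFile())) {
                    writer.write(text);
                }
                StringBuilder result = new StringBuilder();
                try (FileReader reader = new FileReader(file.toFile())) {
                    char[] buffer = new char[1024];
                    int len;
                    // Read one character array at a time until end of stream.
                    while ((len = reader.read(buffer)) != -1) {
                        result.append(buffer, 0, len);
                    }
                }
                return result.toString();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello, character streams")); // hello, character streams
    }
}
```

When a specific character set is required, the InputStreamReader and OutputStreamWriter constructors that take a charsetName remain the more explicit choice.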
