How to read binary data in Python?
bytes
bytes: An immutable sequence of bytes. Comparing dir(str) and dir(bytes) shows that the two types share most of their attributes and methods, with only a few differences. As a result, bytes supports many of the same operations as str, such as searching (find), length (len), splitting (split), and slicing.
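The comparison is easy to check in the interpreter. The following is a minimal sketch; the value `sample` is made up for illustration and does not come from any real file.

```python
# Methods that str and bytes have in common, and those unique to each type.
str_methods = set(dir(str))
bytes_methods = set(dir(bytes))
shared = str_methods & bytes_methods
only_bytes = sorted(bytes_methods - str_methods)  # includes 'decode', 'hex', 'fromhex'
only_str = sorted(str_methods - bytes_methods)    # includes 'encode', 'format'

# Hypothetical sample value, used only for illustration.
sample = b'Start,hello,End'

length = len(sample)               # number of bytes
position = sample.find(b'hello')   # byte offset of the first match, -1 if absent
parts = sample.split(b',')         # list of bytes objects
piece = sample[6:11]               # slicing returns bytes
first = sample[0]                  # indexing returns an int (the byte value)
```

One difference worth remembering: indexing a bytes object gives an integer, while slicing gives bytes.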
The advantage of bytes is that it is a built-in Python type and requires no third-party modules.
The disadvantage is just as clear: find() returns only one match per call, so you cannot get all the matches you need at once.
First, open the file with open() in rb mode and read its content as bytes. The find() method locates a specific byte string, but it returns only the index of the first match, and that index is a byte offset (units of 8 bits), not a bit offset. There is no built-in findall() method for finding every match, so finding several matches is tedious: find the first match at index 1, search again starting after index 1 to get index 2, and so on until find() returns -1.
with open(path, 'rb') as f:
    datas = f.read()
start_char = datas.find(b'Start')
# start_char2 = datas.find(b'Start', start_char + 1)  # next match, searching past the first
end_char = datas.find(b'End', start_char)
# end_char2 = datas.find(b'End', start_char2)
data = datas[start_char:end_char]
print(data)
Note that in the code above, the start and end markers may each appear several times, and not necessarily the same number of times. We need the content between each pair of indexes, but a single call to find() cannot return them all. The commented-out lines have to be run repeatedly to get each further index. Since we do not know how many start markers the file contains, we do not know how many times to run them. This calls for a loop, but there is no ready-made sequence to iterate over: the loop has to keep calling find() until it returns -1, which makes the code more involved.
Moreover, because we need the content between two markers, the search has to be done twice, once for the start marker and once for the end marker, which complicates the process further.
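For reference, the repeated find() calls described above can be written as a loop using only built-ins. This is a minimal sketch rather than a definitive implementation: the helper names `find_all` and `extract_between` and the `sample` data are hypothetical, and overlapping matches are assumed not to matter.

```python
def find_all(data, marker):
    """Return the byte offset of every non-overlapping occurrence of marker."""
    if not marker:
        raise ValueError("marker must not be empty")
    positions = []
    pos = data.find(marker)
    while pos != -1:                      # find() returns -1 when nothing is left
        positions.append(pos)
        pos = data.find(marker, pos + len(marker))
    return positions


def extract_between(data, start_marker=b'Start', end_marker=b'End'):
    """Collect each span from a start marker up to the next end marker."""
    spans = []
    for start in find_all(data, start_marker):
        end = data.find(end_marker, start)
        if end == -1:                     # no closing marker after this start
            break
        spans.append(data[start:end])     # same slicing as the snippet above
    return spans


# Hypothetical sample data, used only for illustration.
sample = b'xxStart-one-EndyyStart-two-Endzz'
start_positions = find_all(sample, b'Start')
spans = extract_between(sample)
```

It works, but the bookkeeping (the -1 check, advancing the search position, the second search for the end marker) is all written by hand.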
It is therefore worth looking for a more convenient approach.
bitstring
bitstring is a third-party package that reads binary files as bit streams.
The first sentence of the bitstring.py file is: This package defines classes that simplify bit-wise creation, manipulation and interpretation of data.
Put simply, it lets you operate directly on binary data such as bytes.
There are four main classes, as follows:
Bits -- An immutable container for binary data.
BitArray -- A mutable container for binary data.
ConstBitStream -- An immutable container with streaming methods.
BitStream -- A mutable container with streaming methods.
As with bytes, first read the file content, then find the marker indexes, and slice to get the data between them.
from bitstring import ConstBitStream, BitStream

hex_datas = ConstBitStream(filename=path)  # read the file content

start_char = b'Start'
# findall() finds every match at once and returns a generator of bit positions
start_indexs = list(hex_datas.findall(start_char, bytealigned=True))

end_char = b'End'
end_indexs = []
for start_index in start_indexs:
    # find() returns a tuple holding the first match's bit position (empty if none)
    end_pos = hex_datas.find(end_char, start=start_index, bytealigned=True)
    end_indexs.extend(end_pos)

result = []
for i in range(min(len(start_indexs), len(end_indexs))):
    hex_data = hex_datas[start_indexs[i]:end_indexs[i]]
    str_data = BitStream.tobytes(hex_data).decode('utf-8')  # same as hex_data.tobytes()
    result.append(str_data)
Code analysis: first import the two required classes, ConstBitStream and BitStream, and read the file content. findall() finds the positions of all matches, while find() finds the position of the first match only; both report positions in bits. Take the shorter of the start and end lists and slice to get the data, which is of type bitstring.ConstBitStream. The tobytes() method converts it to bytes. Non-ASCII text such as Chinese characters is not readable in its bytes form, so call decode() to get the required string.
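Because bitstring reports positions in bits, it can help to convert them back to byte offsets when comparing against results from bytes.find(). The sketch below uses only built-ins; the helper name `bit_to_byte_offset` and the sample positions are hypothetical.

```python
def bit_to_byte_offset(bit_pos):
    """Split a bit position into (byte offset, bit offset within that byte)."""
    return divmod(bit_pos, 8)


# Hypothetical bit positions of the kind a byte-aligned search would report.
bit_positions = [16, 136]
byte_offsets = [bit_to_byte_offset(p) for p in bit_positions]

# A position that is not byte-aligned leaves a non-zero remainder.
unaligned = bit_to_byte_offset(19)
```

Byte-aligned positions are always multiples of 8, so their remainder is 0.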
The whole process is concise and reads straight through. The code uses the findall(), find(), and tobytes() methods. There are also some details to watch. For example, if start_indexs is empty, the subsequent code should not run, and the same applies if end_indexs is empty.
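One way to handle the empty-list cases is to pair the positions in a small helper before slicing. This is a sketch under stated assumptions: the name `pair_indexes` is hypothetical, and the check that each start comes before its end is an extra safeguard that the original code does not perform.

```python
def pair_indexes(start_indexs, end_indexs):
    """Pair each start position with the end position at the same list index."""
    if not start_indexs or not end_indexs:
        return []  # nothing to slice, so skip the rest of the processing
    # zip() stops at the shorter list, like range(min(len(a), len(b)))
    return [(s, e) for s, e in zip(start_indexs, end_indexs) if s < e]


# Hypothetical positions, used only for illustration.
no_pairs = pair_indexes([], [96])
one_pair = pair_indexes([16, 136], [96])
two_pairs = pair_indexes([16, 136], [96, 216])
```

Each returned pair can then be used directly as the bounds of a slice.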
As this shows, the bitstring package is fairly easy to use. This task needs only a few of its methods; the package offers many more, which you can choose from as needed.